Abstract. We survey connections between the theory of bi-Lipschitz embeddings and
the Sparsest Cut Problem in combinatorial optimization. The story of the Sparsest
Cut Problem is a striking example of the deep interplay between analysis, geometry,
and probability on the one hand, and computational issues in discrete mathematics on
the other. We explain how the key ideas evolved over the past 20 years, emphasizing
the interactions with Banach space theory, geometric measure theory, and geometric
group theory. As an important illustrative example, we shall examine recently established
connections to the structure of the Heisenberg group, and the incompatibility of its
Carnot-Carathéodory geometry with the geometry of the Lebesgue space $L_1$.
1. Introduction

Among the common definitions of the Heisenberg group $\mathbb{H}$, it will be convenient
for us to work here with $\mathbb{H}$ modeled as $\mathbb{R}^3$, equipped with the group product
$(a,b,c)\cdot(a',b',c') = (a+a',\, b+b',\, c+c'+ab'-ba')$. The integer lattice $\mathbb{Z}^3$ is
then a discrete cocompact subgroup of $\mathbb{H}$, denoted by $\mathbb{H}(\mathbb{Z})$, which is generated
by the finite symmetric set $\{(\pm 1,0,0), (0,\pm 1,0), (0,0,\pm 1)\}$. The word metric on
$\mathbb{H}(\mathbb{Z})$ induced by this generating set will be denoted by $d_W$.
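
The group law and the word metric can be explored on small examples. The following is a minimal Python sketch, assuming the model of $\mathbb{H}(\mathbb{Z})$ as $\mathbb{Z}^3$ described above; the function names (`multiply`, `inverse`, `word_ball`, `word_distance`) and the `max_radius` cutoff are illustrative choices, not notation from the text. Word lengths are found by breadth-first search in the Cayley graph, which is only practical for small radii.

```python
from collections import deque

# The finite symmetric generating set named in the text.
GENERATORS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def multiply(g, h):
    """Group product (a,b,c)(a',b',c') = (a+a', b+b', c+c'+ab'-ba')."""
    a, b, c = g
    a2, b2, c2 = h
    return (a + a2, b + b2, c + c2 + a * b2 - b * a2)


def inverse(g):
    """Inverse element; for this product it is simply (-a, -b, -c)."""
    a, b, c = g
    return (-a, -b, -c)


def word_ball(radius):
    """Map every element of word length <= radius to its word length (BFS)."""
    identity = (0, 0, 0)
    length = {identity: 0}
    queue = deque([identity])
    while queue:
        g = queue.popleft()
        if length[g] == radius:
            continue
        for s in GENERATORS:
            h = multiply(g, s)
            if h not in length:
                length[h] = length[g] + 1
                queue.append(h)
    return length


def word_distance(x, y, max_radius=8):
    """d_W(x, y) = word length of x^{-1} y, or None if it exceeds max_radius."""
    return word_ball(max_radius).get(multiply(inverse(x), y))
```

For instance, `multiply((1, 0, 0), (0, 1, 0))` and `multiply((0, 1, 0), (1, 0, 0))` differ in the third coordinate, which is the non-commutativity that drives the results discussed below.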

As noted by Semmes, a differentiability result of Pansu jU] implies that the
metric space $(\mathbb{H}(\mathbb{Z}), d_W)$ does not admit a bi-Lipschitz embedding into $\mathbb{R}^n$ for any
$n \in \mathbb{N}$. This was extended by Pauls [52] to bi-Lipschitz non-embeddability results
of $(\mathbb{H}(\mathbb{Z}), d_W)$ into metric spaces with either lower or upper curvature bounds in
the sense of Alexandrov. In [52, 27] it was observed that Pansu's differentiability
argument extends to Banach space targets with the Radon-Nikodym property
(see P3] Ch. 5]), and hence $\mathbb{H}(\mathbb{Z})$ does not admit a bi-Lipschitz embedding into,
say, a Banach space which is either reflexive or is a separable dual; in particular
$\mathbb{H}(\mathbb{Z})$ does not admit a bi-Lipschitz embedding into any $L_p(\mu)$ space, $1 < p < \infty$,
or into the sequence space $\ell_1$.

The embeddability of $\mathbb{H}(\mathbb{Z})$ into the function space $L_1(\mu)$, when $\mu$ is non-atomic,
turned out to be much harder to settle. This question is of particular importance
since it is well understood that for $\mu$ non-atomic, $L_1(\mu)$ is a space for which the
differentiability results quoted above manifestly break down. Nevertheless, Cheeger
and Kleiner [26, 25] introduced a novel notion of differentiability for which they
could prove a differentiability theorem for Lipschitz maps from the Heisenberg
group to $L_1(\mu)$, thus establishing that $\mathbb{H}(\mathbb{Z})$ does not admit a bi-Lipschitz
embedding into any $L_1(\mu)$ space.

Another motivation for the embeddability question for $\mathbb{H}(\mathbb{Z})$ originates
from [52], where it was established that it is connected to the Sparsest Cut Problem
in the field of combinatorial optimization. For this application it was of importance
to obtain quantitative estimates in the non-embeddability results for
$\mathbb{H}(\mathbb{Z})$. It turns out that establishing such estimates is quite subtle, as they require
overcoming finitary issues that do not arise in the infinite setting of [25, 28]. The
following two theorems were proved in [29, 30]. Both theorems follow painlessly
from a more general theorem that is stated and discussed in Section 5.4.

Theorem 1.1. There exists a universal constant $c > 0$ such that any embedding
into $L_1(\mu)$ of the restriction of the word metric $d_W$ to the $n \times n \times n$ grid $\{1,\ldots,n\}^3$
incurs distortion $\gtrsim (\log n)^c$.

Following Gromov [38], the compression rate of $f : \mathbb{H}(\mathbb{Z}) \to L_1(\mu)$, denoted
$\omega_f(\cdot)$, is defined as the largest non-decreasing function such that for all $x,y \in \mathbb{H}(\mathbb{Z})$
we have $\|f(x) - f(y)\|_1 \ge \omega_f(d_W(x,y))$ (see [7] for more information on this topic).

Theorem 1.2. There exists a universal constant $c > 0$ such that for every function
$f : \mathbb{H}(\mathbb{Z}) \to L_1(\mu)$ which is 1-Lipschitz with respect to the word metric $d_W$, we have
$\omega_f(t) \lesssim t/(\log t)^c$ for all $t \ge 2$.

Evaluating the supremum of those $c > 0$ for which Theorem 1.1 holds true
remains an important open question, with geometric significance as well as importance
to theoretical computer science. Conceivably we could get $c$ in Theorem 1.1
to be arbitrarily close to $\frac12$, which would be sharp since the results of [H1IM] imply
(see the explanation in [IT]) that the metric space $(\{1,\ldots,n\}^3, d_W)$ embeds into
$\ell_1$ with distortion $\lesssim \sqrt{\log n}$. Similarly, we do not know the best possible $c$ in
Theorem 1.2; $\frac12$ is again the limit here since it was shown in [69] that there exists
a 1-Lipschitz mapping $f : \mathbb{H}(\mathbb{Z}) \to \ell_1$ for which $\omega_f(t) \gtrsim t/(\sqrt{\log t}\cdot\log\log t)$.

The purpose of this article is to describe the above non-embeddability results for 
the Heisenberg group. Since one of the motivations for these investigations is the 
application to the Sparsest Cut Problem, we also include here a detailed discussion 
of this problem from theoretical computer science, and its deep connections to 
metric geometry. Our goal is to present the ideas in a way that is accessible to 
mathematicians who do not necessarily have background in computer science. 

Acknowledgements. I am grateful to the following people for helpful comments 
and suggestions on earlier versions of this manuscript: Tim Austin, Keith Ball, 
Subhash Khot, Bruce Kleiner, Russ Lyons, Manor Mendel, Gideon Schechtman, 
Lior Silberman. 
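2. Embeddings

A metric space $(\mathcal{M}, d_{\mathcal{M}})$ is said to embed with distortion $D \ge 1$ into a metric space
$(\mathcal{Y}, d_{\mathcal{Y}})$ if there exists a mapping $f : \mathcal{M} \to \mathcal{Y}$, and a scaling factor $s > 0$, such that
for all $x, y \in \mathcal{M}$ we have $s\, d_{\mathcal{M}}(x,y) \le d_{\mathcal{Y}}(f(x), f(y)) \le D s\, d_{\mathcal{M}}(x,y)$. The infimum
over those $D \ge 1$ for which $(\mathcal{M}, d_{\mathcal{M}})$ embeds with distortion $D$ into $(\mathcal{Y}, d_{\mathcal{Y}})$ is
denoted by $c_{\mathcal{Y}}(\mathcal{M})$. If $(\mathcal{M}, d_{\mathcal{M}})$ does not admit a bi-Lipschitz embedding into
$(\mathcal{Y}, d_{\mathcal{Y}})$, we will write $c_{\mathcal{Y}}(\mathcal{M}) = \infty$.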
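
For a fixed map between finite metric spaces, the best scaling factor is the smallest ratio of target distance to source distance, so the distortion of that map is the largest such ratio divided by the smallest. The following minimal sketch illustrates this; the helper names and the example (the 4-cycle with its shortest path metric mapped to the corners of the unit square) are illustrative assumptions, not taken from the text.

```python
import math
from itertools import combinations


def distortion(points, d_source, d_target, f):
    """Smallest D with s*d_source <= d_target(f(x), f(y)) <= D*s*d_source for some s > 0.

    Assumes the points are pairwise distinct, so that d_source(x, y) > 0.
    """
    ratios = [d_target(f(x), f(y)) / d_source(x, y)
              for x, y in combinations(points, 2)]
    if min(ratios) == 0:
        return math.inf  # f collapses two points, so it is not bi-Lipschitz
    return max(ratios) / min(ratios)


def cycle_metric(i, j, n=4):
    """Shortest path metric on the n-cycle with vertices 0, ..., n-1."""
    k = abs(i - j)
    return min(k, n - k)


def square_corner(i):
    """Map the vertices of the 4-cycle to the corners of the unit square."""
    return [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)][i]


d = distortion(range(4), cycle_metric, math.dist, square_corner)
```

Here adjacent vertices keep distance 1 while opposite vertices go from distance 2 to $\sqrt{2}$, so this particular map has distortion $\sqrt{2}$. The quantity $c_{\mathcal{Y}}(\mathcal{M})$ is the infimum of this number over all maps $f$, which the sketch does not attempt to compute.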

Throughout this paper, for $p \ge 1$, the space $L_p$ will stand for $L_p([0,1], \lambda)$,
where $\lambda$ is Lebesgue measure. The spaces $\ell_p$ and $\ell_p^n$ will stand for the space of
p-summable infinite sequences, and $\mathbb{R}^n$ equipped with the $\ell_p$ norm, respectively.
Much of this paper will deal with bi-Lipschitz embeddings of finite metric spaces
into $L_p$. Since every n-point subset of an $L_p(\Omega,\mu)$ space embeds isometrically into
$\ell_p^{n(n-1)/2}$ (see the discussion in [H]), when it comes to embeddings of finite metric
spaces, the distinction between different $L_p(\Omega,\mu)$ spaces is irrelevant. Nevertheless,
later, in the study of the embeddability of the Heisenberg group, we will need to
distinguish between sequence spaces and function spaces.

For $p \ge 1$ we will use the shorter notation $c_p(\mathcal{M}) = c_{L_p}(\mathcal{M})$. The parameter
$c_2(\mathcal{M})$ is known as the Euclidean distortion of $\mathcal{M}$. Dvoretzky's theorem says that
if $\mathcal{Y}$ is an infinite dimensional Banach space then $c_{\mathcal{Y}}(\ell_2^n) = 1$ for all $n \in \mathbb{N}$. Thus,
for every finite metric space $\mathcal{M}$ and every infinite dimensional Banach space $\mathcal{Y}$,
we have $c_{\mathcal{Y}}(\mathcal{M}) \le c_2(\mathcal{M})$.

The following famous theorem of Bourgain [15] will play a key role in what
follows:

Theorem 2.1 (Bourgain's embedding theorem [15]). For every n-point metric
space $(\mathcal{M}, d_{\mathcal{M}})$, we have

\[ c_2(\mathcal{M}) \lesssim \log n. \tag{1} \]

Bourgain proved in [15] that the estimate (1) is sharp up to an iterated logarithm
factor, i.e., that there exist arbitrarily large n-point metric spaces $\mathcal{M}_n$ for
which $c_2(\mathcal{M}_n) \gtrsim \frac{\log n}{\log\log n}$. The $\log\log n$ term was removed in the important
paper [56] of Linial, London and Rabinovich, who showed that the shortest path metric
on bounded degree n-vertex expander graphs has Euclidean distortion $\gtrsim \log n$.

If one is interested only in embeddings into infinite dimensional Banach spaces,
then Theorem 2.1 is stated in the strongest possible form: as noted above, it implies
that for every infinite dimensional Banach space $\mathcal{Y}$ we have $c_{\mathcal{Y}}(\mathcal{M}) \lesssim \log n$.
Below, we will actually use Theorem 2.1 for embeddings into $L_1$, i.e., we will use
the fact that $c_1(\mathcal{M}) \lesssim \log n$. The expander based lower bound of Linial, London
and Rabinovich [56] extends to embeddings into $L_1$ as well, i.e., even this weaker
form of Bourgain's embedding theorem is asymptotically sharp. We refer to [58,
Ch. 15] for a comprehensive discussion of these issues, as well as a nice presentation
of the proof of Bourgain's embedding theorem.
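3. $L_1$ as a metric space

Let $(\Omega, \mu)$ be a measure space. Define a mapping $T : L_1(\Omega,\mu) \to L_\infty(\Omega \times \mathbb{R}, \mu \times \lambda)$,
where $\lambda$ is Lebesgue measure, by:

\[
T(f)(\omega, x) \stackrel{\mathrm{def}}{=}
\begin{cases}
1 & 0 < x \le f(\omega),\\
-1 & f(\omega) < x < 0,\\
0 & \text{otherwise.}
\end{cases}
\]

For all $f, g \in L_1(\Omega,\mu)$ we have:

\[
\big|T(f)(\omega,x) - T(g)(\omega,x)\big| =
\begin{cases}
1 & g(\omega) < x \le f(\omega) \text{ or } f(\omega) < x \le g(\omega),\\
0 & \text{otherwise.}
\end{cases}
\]

Thus, for all $p > 0$ we have,

\[
\|T(f) - T(g)\|^p_{L_p(\Omega\times\mathbb{R},\, \mu\times\lambda)}
= \int_\Omega \left( \int_{\mathbb{R}} \mathbf{1}_{(g(\omega), f(\omega)] \cup (f(\omega), g(\omega)]}(x)\, d\lambda(x) \right) d\mu(\omega)
= \int_\Omega |f(\omega) - g(\omega)|\, d\mu(\omega)
= \|f - g\|_{L_1(\Omega,\mu)}. \tag{2}
\]

Specializing (2) to $p = 2$, we see that:

\[
\|T(f) - T(g)\|_{L_2(\Omega\times\mathbb{R},\, \mu\times\lambda)} = \sqrt{\|f - g\|_{L_1(\Omega,\mu)}}.
\]

Corollary 3.1. The metric space $\left(L_1(\Omega,\mu),\ \|f - g\|^{1/2}_{L_1(\Omega,\mu)}\right)$ admits an isometric
embedding into Hilbert space.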
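
The identity behind Corollary 3.1 can be checked by hand in a discrete setting. The sketch below assumes that $\Omega$ is a finite set with counting measure and that the functions are integer valued, so that $T(f)(\omega,\cdot)$ is constant on each unit cell $(k-1,k)$ and can be stored as a finite table; the function names and the `bound` parameter are illustrative, not from the text.

```python
def T(f, bound):
    """Discretised T(f) for an integer-valued f (a list indexed by the points of Omega).

    Entry [w][.] lists the value of T(f)(w, x) on the unit cells k-1 < x < k
    for k = -bound+1, ..., bound. Requires |f(w)| <= bound for all w.
    """
    table = []
    for value in f:
        row = []
        for k in range(-bound + 1, bound + 1):
            if 1 <= k <= value:
                row.append(1)       # cell lies inside (0, f(w)]
            elif value < k <= 0:
                row.append(-1)      # cell lies inside (f(w), 0)
            else:
                row.append(0)
        table.append(row)
    return table


def lp_distance_to_the_p(u, v, p):
    """||u - v||_p^p for two tables; every cell has (mu x lambda)-measure 1."""
    return sum(abs(a - b) ** p
               for row_u, row_v in zip(u, v)
               for a, b in zip(row_u, row_v))


f = [3, -2, 0, 5]
g = [-1, 4, 2, 5]
l1_distance = sum(abs(a - b) for a, b in zip(f, g))
```

Since the entries of `T(f)` and `T(g)` differ by 0 or 1 in absolute value, `lp_distance_to_the_p(T(f, 5), T(g, 5), p)` agrees with `l1_distance` for every exponent `p`; the case `p = 2` is the discrete form of Corollary 3.1.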

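Another useful corollary is obtained when (2) is specialized to the case $p = 1$.
Take an arbitrary finite subset $X \subseteq L_1(\Omega,\mu)$. For every $(\omega,x) \in \Omega \times \mathbb{R}$ consider
the set $S(\omega,x) = \{f \in X : x \le f(\omega)\} \subseteq X$. For every $S \subseteq X$ we can define
a measurable subset $E_S = \{(\omega,x) \in \Omega\times\mathbb{R} : S(\omega,x) = S\} \subseteq \Omega \times \mathbb{R}$. By the
definition of $T$, for every $f,g \in X$ we have

\[
\|f - g\|_{L_1(\Omega,\mu)} = \|T(f) - T(g)\|_{L_1(\Omega\times\mathbb{R},\, \mu\times\lambda)}
= \int_{\Omega\times\mathbb{R}} \big|\mathbf{1}_{S(\omega,x)}(f) - \mathbf{1}_{S(\omega,x)}(g)\big|\, d(\mu\times\lambda)(\omega,x)
= \sum_{S \subseteq X} (\mu\times\lambda)(E_S)\, \big|\mathbf{1}_S(f) - \mathbf{1}_S(g)\big|,
\]

where here, and in what follows, $\mathbf{1}_S(\cdot)$ is the characteristic function of $S$. Writing
$\beta_S = (\mu\times\lambda)(E_S)$, we have the following important corollary:

Corollary 3.2. Let $X \subseteq L_1(\Omega,\mu)$ be a finite subset of $L_1(\Omega,\mu)$. Then there exist
nonnegative numbers $\{\beta_S\}_{S \subseteq X} \subseteq [0,\infty)$ such that for all $f,g \in X$ we have:

\[
\|f - g\|_{L_1(\Omega,\mu)} = \sum_{S \subseteq X} \beta_S\, \big|\mathbf{1}_S(f) - \mathbf{1}_S(g)\big|. \tag{3}
\]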
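
For points of $\ell_1^m$ (that is, $\Omega = \{1,\ldots,m\}$ with counting measure) the weights $\beta_S$ can be written down explicitly: in each coordinate, sort the values and, for each gap between consecutive values, add the gap length to the weight of the set of points lying above the gap. The following is a minimal sketch of this special case; the function names are illustrative, and subsets are recorded as sets of indices rather than sets of functions. The sets $S = \emptyset$ and $S = X$ are omitted because they separate no pair.

```python
from collections import defaultdict


def cut_decomposition(points):
    """Return {S: beta_S} with ||x_i - x_j||_1 = sum_S beta_S |1_S(i) - 1_S(j)|.

    `points` is a list of equal-length numeric tuples; S is a frozenset of indices.
    """
    beta = defaultdict(float)
    for w in range(len(points[0])):
        levels = sorted({p[w] for p in points})
        for lo, hi in zip(levels, levels[1:]):
            # For lo < x <= hi, the set {i : x <= x_i[w]} is constant.
            S = frozenset(i for i, p in enumerate(points) if p[w] >= hi)
            beta[S] += hi - lo
    return dict(beta)


def cut_distance(beta, i, j):
    """Evaluate the right hand side of (3) for the pair (i, j)."""
    return sum(weight for S, weight in beta.items() if (i in S) != (j in S))


def l1(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))


points = [(0, 0), (1, 3), (4, 1)]
beta = cut_decomposition(points)
```

In this example three cuts carry positive weight, and `cut_distance(beta, i, j)` reproduces `l1(points[i], points[j])` for every pair.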
A metric space $(\mathcal{M}, d_{\mathcal{M}})$ is said to be of negative type if the metric space
$\left(\mathcal{M}, d_{\mathcal{M}}^{1/2}\right)$ admits an isometric embedding into Hilbert space. Such metrics will
play a crucial role in the ensuing discussion. This terminology (see e.g., [33])
is due to a classical theorem of Schoenberg [65], which asserts that $(\mathcal{M}, d_{\mathcal{M}})$ is
of negative type if and only if for every $n \in \mathbb{N}$ and every $x_1,\ldots,x_n \in \mathcal{M}$, the
matrix $(d_{\mathcal{M}}(x_i,x_j))_{i,j=1}^n$ is negative semidefinite on the orthogonal complement of
the main diagonal in $\mathbb{C}^n$, i.e., for all $\zeta_1,\ldots,\zeta_n \in \mathbb{C}$ with $\sum_{j=1}^n \zeta_j = 0$ we have
$\sum_{i=1}^n \sum_{j=1}^n \zeta_i \overline{\zeta_j}\, d_{\mathcal{M}}(x_i,x_j) \le 0$. Corollary 3.1 can be restated as saying that
$L_1(\Omega,\mu)$ is a metric space of negative type.

Corollary 3.2 is often called the cut cone representation of $L_1$ metrics. To
explain this terminology, consider the set $\mathcal{C} \subseteq \mathbb{R}^{n^2}$ of all $n \times n$ real matrices
$A = (a_{ij})$ such that there is a measure space $(\Omega,\mu)$ and $f_1,\ldots,f_n \in L_1(\Omega,\mu)$
with $a_{ij} = \|f_i - f_j\|_{L_1(\Omega,\mu)}$ for all $i,j \in \{1,\ldots,n\}$. If $f_1,\ldots,f_n \in L_1(\Omega_1,\mu_1)$ and
$g_1,\ldots,g_n \in L_1(\Omega_2,\mu_2)$ then for all $c_1, c_2 \ge 0$ and $i,j \in \{1,\ldots,n\}$ we have

\[
c_1\|f_i - f_j\|_{L_1(\Omega_1,\mu_1)} + c_2\|g_i - g_j\|_{L_1(\Omega_2,\mu_2)} = \|h_i - h_j\|_{L_1(\Omega_1 \sqcup \Omega_2,\, \mu_1 \sqcup \mu_2)},
\]

where $h_1,\ldots,h_n$ are functions defined on the disjoint union $\Omega_1 \sqcup \Omega_2$ as follows:
$h_i(\omega) = c_1 f_i(\omega)\mathbf{1}_{\Omega_1}(\omega) + c_2 g_i(\omega)\mathbf{1}_{\Omega_2}(\omega)$. This observation shows that $\mathcal{C}$ is a cone
(of dimension $n(n-1)/2$). Identity (3) says that the cone $\mathcal{C}$ is generated by the
rays induced by cut semimetrics, i.e., by matrices of the form $a_{ij} = |\mathbf{1}_S(i) - \mathbf{1}_S(j)|$
for some $S \subseteq \{1,\ldots,n\}$. It is not difficult to see that these rays are actually the
extreme rays of the cone $\mathcal{C}$. Carathéodory's theorem (for cones) says that we can
choose the coefficients $\{\beta_S\}_{S \subseteq X}$ in (3) so that only $n(n-1)/2$ of them are non-zero.



4. The Sparsest Cut Problem 



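Given $n \in \mathbb{N}$ and two symmetric functions $C, D : \{1,\ldots,n\} \times \{1,\ldots,n\} \to [0,\infty)$
(called capacities and demands, respectively), and a subset $\emptyset \ne S \subsetneq \{1,\ldots,n\}$,
write

\[
\Phi(S) \stackrel{\mathrm{def}}{=} \frac{\sum_{i=1}^n \sum_{j=1}^n C(i,j)\cdot|\mathbf{1}_S(i) - \mathbf{1}_S(j)|}{\sum_{i=1}^n \sum_{j=1}^n D(i,j)\cdot|\mathbf{1}_S(i) - \mathbf{1}_S(j)|}. \tag{4}
\]

The value

\[
\Phi^*(C,D) \stackrel{\mathrm{def}}{=} \min_{\emptyset \ne S \subsetneq \{1,\ldots,n\}} \Phi(S) \tag{5}
\]

is the minimum over all cuts (two-part partitions) of $\{1,\ldots,n\}$ of the ratio between
the total capacity crossing the boundary of the cut and the total demand crossing
the boundary of the cut.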
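
On tiny instances the minimum over cuts can be found by exhaustive search, which makes the definition concrete (and also shows why this is not a practical algorithm: the number of cuts grows exponentially in $n$). The following is a minimal sketch under the assumption that capacities and demands are given as symmetric $n \times n$ lists of lists with vertices numbered from 0; the function names and the 4-cycle example are illustrative.

```python
from itertools import combinations


def cut_ratio(C, D, S):
    """Capacity crossing the cut S divided by demand crossing it (ordered pairs)."""
    n = len(C)
    crossing = [(i, j) for i in range(n) for j in range(n) if (i in S) != (j in S)]
    capacity = sum(C[i][j] for i, j in crossing)
    demand = sum(D[i][j] for i, j in crossing)
    return capacity / demand if demand > 0 else float("inf")


def sparsest_cut_brute_force(C, D):
    """Return (minimum ratio, a minimising subset S) by trying all proper subsets."""
    n = len(C)
    best_value, best_set = float("inf"), None
    for size in range(1, n):
        for subset in combinations(range(n), size):
            value = cut_ratio(C, D, set(subset))
            if value < best_value:
                best_value, best_set = value, set(subset)
    return best_value, best_set


# The 4-cycle with unit capacities on its edges and unit demand on every pair.
CYCLE_C = [[0, 1, 0, 1],
           [1, 0, 1, 0],
           [0, 1, 0, 1],
           [1, 0, 1, 0]]
UNIFORM_D = [[0 if i == j else 1 for j in range(4)] for i in range(4)]

value, cut = sparsest_cut_brute_force(CYCLE_C, UNIFORM_D)
```

For the 4-cycle the search returns the value 1/2, attained by cutting the cycle into two adjacent pairs of vertices: two edges cross the cut and four pairs are separated, with both sums counted over ordered pairs.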

Finding in polynomial time a cut for which $\Phi^*(C,D)$ is attained up to a definite
multiplicative constant is called the Sparsest Cut problem. This problem is used
as a subroutine in many approximation algorithms for NP-hard problems; see the
survey articles [68, 22], as well as [53] Q] and the references in [6, 5] for some of the
vast literature on this topic. Computing $\Phi^*(C,D)$ exactly has been long known to






Assaf Naor 



be NP-hard [ST]. More recently, it was shown in [3T] that there exists $\varepsilon_0 > 0$ such
that it is NP-hard to approximate $\Phi^*(C,D)$ to within a factor smaller than $1 + \varepsilon_0$.
In [17, 53] it was shown that it is Unique Games hard to approximate $\Phi^*(C,D)$ to
within any constant factor (see [44, 45] for more information on the Unique Games
Conjecture; we will return to this issue in Section 4.3.3).

It is customary in the literature to highlight the support of the capacities function
$C$: this allows us to introduce a particularly important special case of the
Sparsest Cut Problem. Thus, a different way to formulate the above setup is via
an n-vertex graph $G = (V,E)$, with a positive weight (called a capacity) $C(e)$
associated to each edge $e \in E$, and a nonnegative weight (called a demand) $D(u,v)$
associated to each pair of vertices $u,v \in V$. The goal is to evaluate in polynomial
time (and in particular, while examining only a negligible fraction of the subsets
of $V$) the quantity:

\[
\Phi^*(C,D) = \min_{\emptyset \ne S \subsetneq V} \frac{\sum_{uv \in E} C(uv)\,|\mathbf{1}_S(u) - \mathbf{1}_S(v)|}{\sum_{u,v \in V} D(u,v)\,|\mathbf{1}_S(u) - \mathbf{1}_S(v)|}.
\]

To get a feeling for the meaning of $\Phi^*$, consider the case $C(e) = D(u,v) = 1$ for
all $e \in E$ and $u,v \in V$. This is an important instance of the Sparsest Cut problem
which is called "Sparsest Cut with Uniform Demands". In this case $\Phi^*$ becomes:

\[
\Phi^* = \min_{\emptyset \ne S \subsetneq V} \frac{\#\{\text{edges joining } S \text{ and } V \setminus S\}}{|S| \cdot |V \setminus S|}.
\]

Thus, in the case of uniform demands, the Sparsest Cut problem essentially amounts
to solving efficiently the combinatorial isoperimetric problem on $G$: determining
the subset of the graph whose ratio of edge boundary to its size is as small as
possible.

In the literature it is also customary to emphasize the size of the support of
the demand function $D$, i.e., to state bounds in terms of the number $k$ of pairs
$\{i,j\} \subseteq \{1,\ldots,n\}$ for which $D(i,j) > 0$. For the sake of simplicity of exposition,
we will not adopt this convention here, and state all of our bounds in terms of
$n$ rather than the number of positive demand pairs $k$. We refer to the relevant
references for the simple modifications that are required to obtain bounds in terms
of $k$ alone.

From now on, the Sparsest Cut problem will be understood to be with general
capacities and demands; when discussing the special case of uniform demands we
will say so explicitly. In applications, general capacities and demands are used to
tune the notion of "interface" between $S$ and $V \setminus S$ to a wide variety of combinatorial
optimization problems, which is one of the reasons why the Sparsest Cut problem
is so versatile in the field of approximation algorithms.

4.1. Reformulation as an optimization problem over $L_1$. Although
the Sparsest Cut Problem clearly has geometric flavor as a discrete isoperimetric
problem, the following key reformulation of it, due to [11 [ I56 j, explicitly
relates it to the geometry of $L_1$.
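Lemma 4.1. Given symmetric $C, D : \{1,\ldots,n\} \times \{1,\ldots,n\} \to [0,\infty)$,
we have:

\[
\Phi^*(C,D) = \min_{f_1,\ldots,f_n \in L_1} \frac{\sum_{i=1}^n \sum_{j=1}^n C(i,j)\,\|f_i - f_j\|_1}{\sum_{i=1}^n \sum_{j=1}^n D(i,j)\,\|f_i - f_j\|_1}. \tag{6}
\]

Proof. Let $\phi$ denote the right hand side of (6), and write $\Phi^* = \Phi^*(C,D)$. Given
a subset $S \subseteq \{1,\ldots,n\}$, by considering $f_i = \mathbf{1}_S(i) \in \{0,1\} \subseteq L_1$ we see that
$\phi \le \Phi^*$. In the reverse direction, if $X = \{f_1,\ldots,f_n\} \subseteq L_1$ then let $\{\beta_S\}_{S \subseteq X}$
be the non-negative weights from Corollary 3.2. For $S \subseteq X$ define a subset of
$\{1,\ldots,n\}$ by $S' = \{i \in \{1,\ldots,n\} : f_i \in S\}$. It follows from the definition of $\Phi^*$
that for all $S \subseteq X$ we have,

\[
\sum_{i=1}^n \sum_{j=1}^n C(i,j)\,|\mathbf{1}_S(f_i) - \mathbf{1}_S(f_j)|
= \sum_{i=1}^n \sum_{j=1}^n C(i,j)\,|\mathbf{1}_{S'}(i) - \mathbf{1}_{S'}(j)|
\ge \Phi^* \sum_{i=1}^n \sum_{j=1}^n D(i,j)\,|\mathbf{1}_S(f_i) - \mathbf{1}_S(f_j)|. \tag{7}
\]

Thus

\[
\sum_{i=1}^n \sum_{j=1}^n C(i,j)\,\|f_i - f_j\|_1
\stackrel{(3)}{=} \sum_{S \subseteq X} \beta_S \sum_{i=1}^n \sum_{j=1}^n C(i,j)\,|\mathbf{1}_S(f_i) - \mathbf{1}_S(f_j)|
\stackrel{(7)}{\ge} \Phi^* \sum_{S \subseteq X} \beta_S \sum_{i=1}^n \sum_{j=1}^n D(i,j)\,|\mathbf{1}_S(f_i) - \mathbf{1}_S(f_j)|
\stackrel{(3)}{=} \Phi^* \sum_{i=1}^n \sum_{j=1}^n D(i,j)\,\|f_i - f_j\|_1.
\]

It follows that $\phi \ge \Phi^*$, as required. □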

4.2. The linear program. Lemma 4.1 is a reformulation of the Sparsest
Cut Problem in terms of a continuous optimization problem on the space $L_1$.
Being a reformulation, it shows in particular that solving $L_1$ optimization problems
such as the right hand side of (6) is NP-hard.

In the beautiful paper [53] of Leighton and Rao it was shown that there exists a
polynomial time algorithm that, given an n-vertex graph $G = (V,E)$, computes a
number which is guaranteed to be within a factor of $\lesssim \log n$ of the uniform Sparsest
Cut value (4). The Leighton-Rao algorithm uses combinatorial ideas which do not
apply to Sparsest Cut with general demands. A breakthrough result, due to
Linial-London-Rabinovich [56] and Aumann-Rabani [9], introduced embedding methods
to this field, yielding a polynomial time algorithm which computes $\Phi^*(C,D)$ up to
a factor $\lesssim \log n$ for all $C, D : \{1,\ldots,n\} \times \{1,\ldots,n\} \to [0,\infty)$.

The key idea of [56, 9] is based on replacing the finite subset $\{f_1,\ldots,f_n\}$
of $L_1$ in (6) by an arbitrary semimetric on $\{1,\ldots,n\}$. Specifically, by homogeneity
we can always assume that the denominator in (6) equals 1, in which
case Lemma 4.1 says that $\Phi^*(C,D)$ equals the minimum of $\sum_{i=1}^n \sum_{j=1}^n C(i,j)\,d_{ij}$,
given that $\sum_{i=1}^n \sum_{j=1}^n D(i,j)\,d_{ij} = 1$ and there exist $f_1,\ldots,f_n \in L_1$ for which
$d_{ij} = \|f_i - f_j\|_1$ for all $i,j \in \{1,\ldots,n\}$. We can now ignore the fact that $d_{ij}$ was a
semimetric that came from a subset of $L_1$, i.e., we can define $M^*(C,D)$ to be the
minimum of $\sum_{i=1}^n \sum_{j=1}^n C(i,j)\,d_{ij}$, given that $\sum_{i=1}^n \sum_{j=1}^n D(i,j)\,d_{ij} = 1$, $d_{ii} = 0$,
$d_{ij} \ge 0$, $d_{ij} = d_{ji}$ for all $i,j \in \{1,\ldots,n\}$ ($n(n-1)/2$ symmetry constraints) and
$d_{ij} \le d_{ik} + d_{kj}$ for all $i,j,k \in \{1,\ldots,n\}$ ($\le n^3$ triangle inequality constraints).

Clearly $M^*(C,D) \le \Phi^*(C,D)$, since we are minimizing over all semimetrics
rather than just those arising from subsets of $L_1$. Moreover, $M^*(C,D)$ can be
computed in polynomial time up to arbitrarily good precision [40], since it is a
linear program (minimizing a linear functional in the variables $(d_{ij})$ subject to
polynomially many linear constraints).
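
The constraints of this linear program are easy to state in code, even without a solver. The sketch below does not solve the program; it only checks whether a candidate matrix satisfies the semimetric constraints and evaluates the objective after rescaling so that the demand constraint holds. The function names, the tolerance, and the 4-cycle data are illustrative assumptions.

```python
def is_semimetric(d, tol=1e-12):
    """Check d_ii = 0, d_ij >= 0, d_ij = d_ji and d_ij <= d_ik + d_kj."""
    n = len(d)
    for i in range(n):
        if abs(d[i][i]) > tol:
            return False
        for j in range(n):
            if d[i][j] < -tol or abs(d[i][j] - d[j][i]) > tol:
                return False
            for k in range(n):
                if d[i][j] > d[i][k] + d[k][j] + tol:
                    return False
    return True


def lp_value(C, D, d):
    """Objective sum C(i,j) d_ij once d is rescaled so that sum D(i,j) d_ij = 1."""
    if not is_semimetric(d):
        raise ValueError("d violates the constraints of the linear program")
    n = len(d)
    demand = sum(D[i][j] * d[i][j] for i in range(n) for j in range(n))
    if demand <= 0:
        raise ValueError("d cannot be normalised: total demand is zero")
    capacity = sum(C[i][j] * d[i][j] for i in range(n) for j in range(n))
    return capacity / demand


# 4-cycle with unit capacities and uniform demands (vertices 0..3).
CYCLE_C = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]]
UNIFORM_D = [[0 if i == j else 1 for j in range(4)] for i in range(4)]

cut_metric = [[1 if (i in {0, 1}) != (j in {0, 1}) else 0 for j in range(4)]
              for i in range(4)]
path_metric = [[min(abs(i - j), 4 - abs(i - j)) for j in range(4)] for i in range(4)]
```

Every cut semimetric is feasible, which is the reason the minimum over all semimetrics is at most the sparsest cut value; in this small example both the cut semimetric of $\{0,1\}$ and the shortest path metric of the cycle give the value 1/2.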

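The linear program produces a semimetric $d^*_{ij}$ on $\{1,\ldots,n\}$ which satisfies
$M^*(C,D) = \sum_{i=1}^n \sum_{j=1}^n C(i,j)\,d^*_{ij}$ and $\sum_{i=1}^n \sum_{j=1}^n D(i,j)\,d^*_{ij} = 1$ (ignoring
arbitrarily small errors). By Lemma 4.1 we need to somehow relate this semimetric
to $L_1$. It is at this juncture that we see the power of Bourgain's embedding theorem
(Theorem 2.1): the constraints of the linear program only provide us the information that
$d^*_{ij}$ is a semimetric, and nothing else. So, we need to be able to somehow handle
arbitrary metric spaces, which is precisely what Bourgain's theorem does, by furnishing
$f_1,\ldots,f_n \in L_1$ such that for all $i,j \in \{1,\ldots,n\}$ we have

\[
\frac{d^*_{ij}}{\log n} \lesssim \|f_i - f_j\|_1 \le d^*_{ij}. \tag{8}
\]

Now,

\[
\Phi^*(C,D) \stackrel{(6)}{\le} \frac{\sum_{i=1}^n \sum_{j=1}^n C(i,j)\,\|f_i - f_j\|_1}{\sum_{i=1}^n \sum_{j=1}^n D(i,j)\,\|f_i - f_j\|_1}
\stackrel{(8)}{\lesssim} \log n \cdot \frac{\sum_{i=1}^n \sum_{j=1}^n C(i,j)\,d^*_{ij}}{\sum_{i=1}^n \sum_{j=1}^n D(i,j)\,d^*_{ij}} = \log n \cdot M^*(C,D). \tag{9}
\]

Thus, $\frac{\Phi^*(C,D)}{\log n} \lesssim M^*(C,D) \le \Phi^*(C,D)$, i.e., the polynomial time algorithm of
computing $M^*(C,D)$ is guaranteed to produce a number which is within a factor
$\lesssim \log n$ of $\Phi^*(C,D)$.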
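
The upper bound in the rounding step is easy to see for maps built from distances to subsets. The sketch below is in the spirit of the standard proof of Bourgain's theorem (coordinates are distances to random subsets at densities $2^{-j}$), but it is only an illustration under stated assumptions: the number of repetitions, the seed, and the function names are arbitrary choices, and the sketch makes no claim about the lower bound in the embedding guarantee, which is the hard part of the theorem.

```python
import math
import random


def distance_to_subsets_embedding(d, repetitions=8, seed=0):
    """Map i to the averaged vector (d(i, A_1), ..., d(i, A_m)) / m.

    d is an n x n semimetric given as a list of lists. The subsets A_t are random,
    with each point included independently with probability 2^{-j}, j = 1..ceil(log2 n).
    """
    rng = random.Random(seed)
    n = len(d)
    scales = max(1, math.ceil(math.log2(n)))
    subsets = []
    for j in range(1, scales + 1):
        for _ in range(repetitions):
            A = [i for i in range(n) if rng.random() < 2.0 ** -j]
            if A:  # d(i, A) is undefined for the empty set, so skip it
                subsets.append(A)
    if not subsets:
        subsets.append([0])  # fallback so that the map is always defined
    m = len(subsets)
    return [[min(d[i][a] for a in A) / m for A in subsets] for i in range(n)]


def l1(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))


# Shortest path metric on the 8-cycle as a test input.
n = 8
cycle = [[min(abs(i - j), n - abs(i - j)) for j in range(n)] for i in range(n)]
embedding = distance_to_subsets_embedding(cycle)
```

Each coordinate $i \mapsto d(i,A)$ is 1-Lipschitz by the triangle inequality, so after averaging the map never expands distances, whatever subsets are drawn; this corresponds to the right hand inequality in the displayed embedding guarantee.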

Remark 4.2. In the above argument we only discussed the algorithmic task of
fast estimation of the number $\Phi^*(C,D)$, rather than the problem of producing
in polynomial time a subset $\emptyset \ne S \subsetneq \{1,\ldots,n\}$ for which $\Phi(S)$ is close up to a
certain multiplicative guarantee to the optimum value $\Phi^*(C,D)$. All the algorithms
discussed in this paper produce such a set $S$, rather than just approximating the
number $\Phi^*(C,D)$. In order to modify the argument above to this setting, one needs
to go into the proof of Bourgain's embedding theorem, which as currently stated
is just an existential result for $f_1,\ldots,f_n$ as in (8). This issue is addressed in [56],
which provides an algorithmic version of Bourgain's theorem. Ensuing algorithms
in this paper can be similarly modified to produce a good cut $S$, but we will ignore
this issue from now on, and continue to focus solely on algorithms for approximate
computation of $\Phi^*(C,D)$.
4.3. The semidefinite program. We have already stated in Section 2
that the logarithmic loss in the application (8) of Bourgain's theorem cannot be
improved. Thus, in order to obtain a polynomial time algorithm with approximation
guarantee better than $\lesssim \log n$, we need to impose additional geometric
restrictions on the metric $d^*_{ij}$; conditions that will hopefully yield a class of metric
spaces for which one can prove an $L_1$ distortion bound that is asymptotically
smaller than the $\lesssim \log n$ of Bourgain's embedding theorem. This is indeed possible,
based on a quadratic variant of the discussion in Section 4.2, an approach due
to Goemans and Linial [32, 55, 54].

The idea of Goemans and Linial is based on Corollary 3.1, i.e., on the fact that
the metric space $L_1$ is of negative type. We define $M^{**}(C,D)$ to be the minimum
of $\sum_{i=1}^n \sum_{j=1}^n C(i,j)\,d_{ij}$, subject to the constraint that $\sum_{i=1}^n \sum_{j=1}^n D(i,j)\,d_{ij} = 1$
and $d_{ij}$ is a semimetric of negative type on $\{1,\ldots,n\}$. The latter condition can be
equivalently restated as the requirement that, in addition to $d_{ij}$ being a semimetric
on $\{1,\ldots,n\}$, there exist vectors $v_1,\ldots,v_n \in L_2$ such that $d_{ij} = \|v_i - v_j\|_2^2$ for all
$i,j \in \{1,\ldots,n\}$. Equivalently, there exists a symmetric positive semidefinite $n \times n$
matrix $(a_{ij})$ (the Gram matrix of $v_1,\ldots,v_n$), such that $d_{ij} = a_{ii} + a_{jj} - 2a_{ij}$ for
all $i,j \in \{1,\ldots,n\}$.



Thus, $M^{**}(C,D)$ is the minimum of $\sum_{i=1}^n \sum_{j=1}^n C(i,j)(a_{ii} + a_{jj} - 2a_{ij})$, which is a
linear function in the variables $(a_{ij})$, subject to the constraint that $(a_{ij})$ is a
symmetric positive semidefinite matrix, in conjunction with the linear constraints
$\sum_{i=1}^n \sum_{j=1}^n D(i,j)(a_{ii} + a_{jj} - 2a_{ij}) = 1$ and for all $i,j,k \in \{1,\ldots,n\}$, the triangle
inequality constraint $a_{ii} + a_{jj} - 2a_{ij} \le (a_{ii} + a_{kk} - 2a_{ik}) + (a_{kk} + a_{jj} - 2a_{kj})$. Such
an optimization problem is called a semidefinite program, and by the methods
described in [40], $M^{**}(C,D)$ can be computed with arbitrarily good precision in
polynomial time.
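
The passage from vectors to the Gram matrix variables, and the role of the extra triangle inequality constraints, can be illustrated on two small configurations. This is a minimal sketch with illustrative function names; it evaluates the constraints for given vectors and does not solve the semidefinite program.

```python
def gram_matrix(vectors):
    """a_ij = <v_i, v_j> for a list of equal-length numeric tuples."""
    return [[sum(x * y for x, y in zip(u, v)) for v in vectors] for u in vectors]


def squared_distances(a):
    """d_ij = a_ii + a_jj - 2 a_ij, i.e. the squared Euclidean distance ||v_i - v_j||^2."""
    n = len(a)
    return [[a[i][i] + a[j][j] - 2 * a[i][j] for j in range(n)] for i in range(n)]


def satisfies_triangle_inequalities(d, tol=1e-12):
    """Check d_ij <= d_ik + d_kj for all triples."""
    n = len(d)
    return all(d[i][j] <= d[i][k] + d[k][j] + tol
               for i in range(n) for j in range(n) for k in range(n))


square = [(0, 0), (1, 0), (1, 1), (0, 1)]   # corners of the unit square
line = [(0,), (1,), (2,)]                   # three collinear points

d_square = squared_distances(gram_matrix(square))
d_line = squared_distances(gram_matrix(line))
```

Squared Euclidean distances between the corners of the unit square satisfy the triangle inequality, while three equally spaced collinear points give squared distances 1, 1 and 4, which do not. Positive semidefiniteness of $(a_{ij})$ alone therefore does not make $d_{ij}$ a semimetric, which is why the triangle inequalities are imposed as separate linear constraints.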

Corollary 3.1 and Lemma 4.1 imply that $M^*(C,D) \le M^{**}(C,D) \le \Phi^*(C,D)$.
The following breakthrough result of Arora, Rao and Vazirani [6] shows that for
Sparsest Cut with uniform demands the Goemans-Linial approach does indeed
yield an improved approximation algorithm:

Theorem 4.3 ([6]). In the case of uniform demands, i.e., if $C(i,j) \in \{0,1\}$ and
$D(i,j) = 1$ for all $i,j \in \{1,\ldots,n\}$, we have

\[
\frac{\Phi^*(C,D)}{\sqrt{\log n}} \lesssim M^{**}(C,D) \le \Phi^*(C,D). \tag{10}
\]

In the case of general demands we have almost the same result, up to lower
order factors:

Theorem 4.4 ([5]). For all symmetric $C, D : \{1,\ldots,n\} \times \{1,\ldots,n\} \to [0,\infty)$ we
have

\[
\frac{\Phi^*(C,D)}{(\log n)^{\frac12 + o(1)}} \lesssim M^{**}(C,D) \le \Phi^*(C,D). \tag{11}
\]

The $o(1)$ term in (11) is $\lesssim \frac{\log\log\log n}{\log\log n}$. We conjecture that it could be removed
altogether, though at present it seems to be an inherent artifact of complications
in the proof in [5].
Before explaining some of the ideas behind the proofs of Theorem 4.3 and
Theorem 4.4 (the full details are quite lengthy and are beyond the scope of this
survey), we prove, following [55, Prop. 15.5.2], a crucial identity (attributed in [55]
to Y. Rabinovich) which reformulates these results in terms of an $L_1$ embeddability
problem.

Lemma 4.5. We have

\[
\sup\left\{ \frac{\Phi^*(C,D)}{M^{**}(C,D)} :\ C, D : \{1,\ldots,n\} \times \{1,\ldots,n\} \to [0,\infty) \right\}
= \sup\Big\{ c_1(\{1,\ldots,n\}, d) :\ d \text{ is a metric of negative type} \Big\}. \tag{12}
\]

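Proof. The proof of the fact that the left hand side of (12) is at most the right
hand side of (12) is identical to the way (9) was deduced from (8).

In the reverse direction, let $d^*$ be a metric of negative type on $\{1,\ldots,n\}$ for
which $c_1(\{1,\ldots,n\}, d^*) \stackrel{\mathrm{def}}{=} c$ is maximal among all such metrics. Let $\mathcal{C} \subseteq \mathbb{R}^{n^2}$
be the cone in the space of $n \times n$ symmetric matrices from the last paragraph
of Section 3, i.e., $\mathcal{C}$ consists of all matrices of the form $(\|f_i - f_j\|_1)$ for some
$f_1,\ldots,f_n \in L_1$.

Fix $\varepsilon \in (0, c-1)$ and let $\mathcal{K}_\varepsilon \subseteq \mathbb{R}^{n^2}$ be the set of all symmetric matrices $(a_{ij})$
for which there exists $s > 0$ such that $s\,d^*(i,j) \le a_{ij} \le (c-\varepsilon)s\,d^*(i,j)$ for all
$i,j \in \{1,\ldots,n\}$. By the definition of $c$, the convex sets $\mathcal{C}$ and $\mathcal{K}_\varepsilon$ are disjoint,
since otherwise $d^*$ would admit an embedding into $L_1$ with distortion $c - \varepsilon$. It
follows that there exists a symmetric matrix $(h^\varepsilon_{ij}) \in M_n(\mathbb{R}) \setminus \{0\}$ and $\alpha \in \mathbb{R}$, such
that $\sum_{i=1}^n \sum_{j=1}^n h^\varepsilon_{ij} a_{ij} \le \alpha$ for all $(a_{ij}) \in \mathcal{K}_\varepsilon$, and $\sum_{i=1}^n \sum_{j=1}^n h^\varepsilon_{ij} b_{ij} \ge \alpha$ for all
$(b_{ij}) \in \mathcal{C}$. Since both $\mathcal{C}$ and $\mathcal{K}_\varepsilon$ are closed under multiplication by positive scalars,
necessarily $\alpha = 0$.

Define $C_\varepsilon(i,j) \stackrel{\mathrm{def}}{=} h^\varepsilon_{ij}\mathbf{1}_{\{h^\varepsilon_{ij} \ge 0\}}$ and $D_\varepsilon(i,j) \stackrel{\mathrm{def}}{=} |h^\varepsilon_{ij}|\mathbf{1}_{\{h^\varepsilon_{ij} < 0\}}$. By definition of
$M^{**}(C_\varepsilon, D_\varepsilon)$,

\[
\sum_{i=1}^n \sum_{j=1}^n C_\varepsilon(i,j)\,d^*_{ij} \ge M^{**}(C_\varepsilon, D_\varepsilon) \cdot \sum_{i=1}^n \sum_{j=1}^n D_\varepsilon(i,j)\,d^*_{ij}. \tag{13}
\]

By considering $a_{ij} \stackrel{\mathrm{def}}{=} \left((c-\varepsilon)\mathbf{1}_{\{h^\varepsilon_{ij} \ge 0\}} + \mathbf{1}_{\{h^\varepsilon_{ij} < 0\}}\right) d^*(i,j) \in \mathcal{K}_\varepsilon$, the inequality
$\sum_{i=1}^n \sum_{j=1}^n h^\varepsilon_{ij} a_{ij} \le 0$ becomes:

\[
\sum_{i=1}^n \sum_{j=1}^n D_\varepsilon(i,j)\,d^*_{ij} \ge (c - \varepsilon) \sum_{i=1}^n \sum_{j=1}^n C_\varepsilon(i,j)\,d^*_{ij}. \tag{14}
\]

A combination of (13) and (14) implies that $(c-\varepsilon)M^{**}(C_\varepsilon, D_\varepsilon) \le 1$. At the same
time, for all $f_1,\ldots,f_n \in L_1$, the inequality $\sum_{i=1}^n \sum_{j=1}^n h^\varepsilon_{ij}\|f_i - f_j\|_1 \ge 0$ is the
same as $\sum_{i=1}^n \sum_{j=1}^n C_\varepsilon(i,j)\|f_i - f_j\|_1 \ge \sum_{i=1}^n \sum_{j=1}^n D_\varepsilon(i,j)\|f_i - f_j\|_1$, which by
Lemma 4.1 means that $\Phi^*(C_\varepsilon, D_\varepsilon) \ge 1$. Thus $\Phi^*(C_\varepsilon, D_\varepsilon)/M^{**}(C_\varepsilon, D_\varepsilon) \ge c - \varepsilon$,
and since this holds for all $\varepsilon \in (0, c-1)$, the proof of Lemma 4.5 is complete. □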
In the case of Sparsest Cut with uniform demands, we have the following result
which is analogous to Lemma 4.5, where the $L_1$ bi-Lipschitz distortion is replaced
by the smallest possible factor by which 1-Lipschitz functions into $L_1$ can distort
the average distance. The proof is a slight variant of the proof of Lemma 4.5;
the simple details are left to the reader. This connection between Sparsest Cut
with uniform demands and embeddings that preserve the average distance is due
to Rabinovich [63].

Lemma 4.6. The supremum of $\Phi^*(C,D)/M^{**}(C,D)$ over all instances of uniform
demands, i.e., when $C(i,j) \in \{0,1\}$ and $D(i,j) = 1$ for all $i,j \in \{1,\ldots,n\}$, equals
the infimum over $A > 0$ such that for all metrics $d$ on $\{1,\ldots,n\}$ of negative type,
there exist $f_1,\ldots,f_n \in L_1$ satisfying $\|f_i - f_j\|_1 \le d(i,j)$ for all $i,j \in \{1,\ldots,n\}$
and $A \sum_{i=1}^n \sum_{j=1}^n \|f_i - f_j\|_1 \ge \sum_{i=1}^n \sum_{j=1}^n d(i,j)$.

4.3.1. $L_2$ embeddings of negative type metrics. The proof of Theorem 4.3
in [6] is based on a clever geometric partitioning procedure for metrics of negative
type. Building heavily on ideas of [6], in conjunction with some substantial
additional combinatorial arguments, an alternative approach to Theorem 4.3 was
obtained in [59], based on a purely graph theoretical statement which is of independent
interest. We shall now sketch this approach, since it is modular and general,
and as such it is useful for additional geometric corollaries. We refer to [59] for
more information on these additional applications, as well as to [6] for the original
proof of Theorem 4.3.

Let $G = (V,E)$ be an n-vertex graph. The vertex expansion of $G$, denoted
$h(G)$, is the largest $h \ge 0$ such that every $S \subseteq V$ with $|S| \le n/2$ has at least $h|S|$
neighbors in $V \setminus S$. The edge expansion of $G$, denoted $\alpha(G)$, is the largest $\alpha \ge 0$
such that for every $S \subseteq V$ with $|S| \le n/2$, the number of edges joining $S$ and $V \setminus S$
is at least $\alpha|S| \cdot \frac{|E|}{n}$. The main combinatorial statement of [59] relates these two
notions of expansion of graphs:
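
For very small graphs the vertex expansion can be computed directly from its definition by enumerating all subsets of at most half the vertices. The following is a minimal brute-force sketch; the function name and the edge-list input format (vertices numbered from 0) are illustrative assumptions, and the running time is exponential in the number of vertices.

```python
from itertools import combinations


def vertex_expansion(n, edges):
    """Largest h such that every nonempty S with |S| <= n/2 has >= h|S| neighbours outside S."""
    adjacency = {v: set() for v in range(n)}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    best = float("inf")
    for size in range(1, n // 2 + 1):
        for subset in combinations(range(n), size):
            S = set(subset)
            outside_neighbours = set().union(*(adjacency[v] for v in S)) - S
            best = min(best, len(outside_neighbours) / size)
    return best


cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
path_edges = [(0, 1), (1, 2), (2, 3)]
```

On four vertices, the cycle has vertex expansion 1 (any two vertices have two neighbours outside), while the path has vertex expansion 1/2, witnessed by the set consisting of an end vertex and its neighbour.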

Theorem 4.7 (Edge Replacement Theorem [59]). For every graph $G = (V,E)$
with $h(G) \ge \frac12$ there is a set of edges $E'$ on $V$ with $\alpha(V,E') \gtrsim 1$, and such that for
every $uv \in E'$ we have $d_G(u,v) \lesssim \sqrt{\log|V|}$. Here $d_G$ is the shortest path metric on
$G$ (with respect to the original edge set $E$), and all implicit constants are universal.

It is shown in [59] that the $\lesssim \sqrt{\log n}$ bound on the length of the new edges in
Theorem 4.7 is asymptotically tight. The proof of Theorem 4.7 is involved, and
cannot be described here: it has two components, a combinatorial construction, as
well as a purely Hilbertian geometric argument based on, and simpler than, the original
algorithm of [6]. We shall now explain how Theorem 4.7 implies Theorem 4.3
(this is somewhat different from the deduction in [59], which deals with a different
semidefinite program for Sparsest Cut with uniform demands).

Proof of Theorem 4.3 assuming Theorem 4.7. An application of (the easy direction
of) Lemma 4.6 shows that in order to prove Theorem 4.3 it suffices to show that
if $(\mathcal{M}, d)$ is an n-point metric space of negative type, with $\frac{1}{n^2}\sum_{x,y \in \mathcal{M}} d(x,y) = 1$,
then there exists a mapping $F : \mathcal{M} \to \mathbb{R}$ which is 1-Lipschitz and such that
$\frac{1}{n^2}\sum_{x,y \in \mathcal{M}} |F(x) - F(y)| \gtrsim 1/\sqrt{\log n}$. In what follows we use the standard notation
for closed balls: for $x \in \mathcal{M}$ and $t \ge 0$, set $B(x,t) = \{y \in \mathcal{M} : d(x,y) \le t\}$.

Choose $x_0 \in \mathcal{M}$ with $\frac1n \sum_{y \in \mathcal{M}} d(x_0,y) = r \stackrel{\mathrm{def}}{=} \min_{x \in \mathcal{M}} \frac1n \sum_{y \in \mathcal{M}} d(x,y)$. Then
$r \le \frac{1}{n^2}\sum_{x,y \in \mathcal{M}} d(x,y) = 1$, implying $1 \ge \frac1n \sum_{y \in \mathcal{M}} d(x_0,y) > \frac2n |\mathcal{M} \setminus B(x_0,2)|$, or
$|B(x_0,2)| > n/2$. Similarly $|B(x_0,4)| > 3n/4$.

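Assume first that $\frac{1}{n^2}\sum_{x,y \in B(x_0,4)} d(x,y) \le \frac14$ (this will be the easy case). Then

\[
1 = \frac{1}{n^2}\sum_{x,y \in \mathcal{M}} d(x,y) \le \frac14 + \frac{2}{n^2}\sum_{x \in \mathcal{M}} \sum_{y \in \mathcal{M} \setminus B(x_0,4)} \big(d(x,x_0) + d(x_0,y)\big)
\]
\[
= \frac14 + \frac{2r}{n}\,|\mathcal{M} \setminus B(x_0,4)| + \frac2n \sum_{y \in \mathcal{M} \setminus B(x_0,4)} d(x_0,y) \le \frac14 + \frac3n \sum_{y \in \mathcal{M} \setminus B(x_0,4)} d(x_0,y),
\]

or $\frac1n \sum_{y \in \mathcal{M} \setminus B(x_0,4)} d(x_0,y) \ge \frac14$. Define a 1-Lipschitz mapping $F : \mathcal{M} \to \mathbb{R}$ by
$F(x) = d(x, B(x_0,2)) = \min_{y \in B(x_0,2)} d(x,y)$. The triangle inequality implies that
for every $y \in \mathcal{M} \setminus B(x_0,4)$ we have $F(y) \ge \frac12 d(y,x_0)$. Thus

\[
\frac{1}{n^2}\sum_{x,y \in \mathcal{M}} |F(x) - F(y)| \ge \frac{|B(x_0,2)|}{n^2} \sum_{y \in \mathcal{M} \setminus B(x_0,4)} F(y)
\ge \frac{1}{4n} \sum_{y \in \mathcal{M} \setminus B(x_0,4)} d(y,x_0) \ge \frac{1}{16} = \frac{1}{16 n^2} \sum_{x,y \in \mathcal{M}} d(x,y).
\]

This completes the easy case, where there is even no loss of $1/\sqrt{\log n}$ (and we did
not use yet the assumption that $d$ is a metric of negative type).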

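We may therefore assume from now on that $\frac{1}{n^2}\sum_{x,y \in B(x_0,4)} d(x,y) \ge \frac14$. The
fact that $d$ is of negative type means that there are vectors $\{v_x\}_{x \in \mathcal{M}} \subseteq L_2$ such
that $d(x,y) = \|v_x - v_y\|_2^2$ for all $x,y \in \mathcal{M}$.

We will show that for a small enough universal constant $\varepsilon > 0$, there are two
sets $S_1, S_2 \subseteq B(x_0,4)$ such that $|S_1|, |S_2| \ge \varepsilon n$ and $d(S_1,S_2) \ge \varepsilon^2/\sqrt{\log n}$. Once
this is achieved, the mapping $F : \mathcal{M} \to \mathbb{R}$ given by $F(x) = d(x,S_1)$ will satisfy
$\frac{1}{n^2}\sum_{x,y \in \mathcal{M}} |F(x) - F(y)| \ge \frac{\varepsilon^2}{n^2\sqrt{\log n}} \cdot |S_1| \cdot |S_2| \ge \frac{\varepsilon^4}{\sqrt{\log n}}$, as desired.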

Assume for contradiction that no such $S_1, S_2$ exist. Define a set of edges $E_0$
on $B(x_0,4)$ by $E_0 \stackrel{\mathrm{def}}{=} \left\{ \{x,y\} \subseteq B(x_0,4) :\ x \ne y \ \wedge\ d(x,y) < \varepsilon^2/\sqrt{\log n} \right\}$.
Our contrapositive assumption says that any two subsets $S_1, S_2 \subseteq B(x_0,4)$ with
$|S_1|, |S_2| \ge \varepsilon n \ge \varepsilon|B(x_0,4)|$ are joined by an edge from $E_0$. By a (simple) general
graph theoretical lemma (see [59, Lem 2.3]), this implies that, provided $\varepsilon \le 1/10$,
there exists a subset $V \subseteq B(x_0,4)$ with $|V| \ge (1-\varepsilon)|B(x_0,4)| \gtrsim n$, such that the
graph induced by $E_0$ on $V$, i.e., $G = \left(V, E = E_0 \cap \binom{V}{2}\right)$, has $h(G) \ge \frac12$.

We are now in position to apply the Edge Replacement Theorem, i.e., Theorem 4.7.
We obtain a new set of edges $E'$ on $V$ such that $\alpha(V,E') \gtrsim 1$ and for
every $xy \in E'$ we have $d_G(x,y) \lesssim \sqrt{\log n}$. The latter condition means that there
exists a path $\{x = x_0, x_1, \ldots, x_m = y\} \subseteq V$ such that $m \lesssim \sqrt{\log n}$ and $x_i x_{i-1} \in E$
for every $i \in \{1,\ldots,m\}$. By the definition of $E$, this implies that

\[
xy \in E' \implies d(x,y) \le \sum_{i=1}^m d(x_i, x_{i-1}) \le m \cdot \frac{\varepsilon^2}{\sqrt{\log n}} \lesssim \varepsilon^2. \tag{15}
\]

It is a standard fact (the equivalence between edge expansion and a Cheeger
inequality) that for every $f: V\to L_1$ we have

$$\frac{1}{|E'|}\sum_{xy\in E'} \|f(x)-f(y)\|_1 \gtrsim \frac{\alpha(V,E')}{|V|^2}\sum_{x,y\in V} \|f(x)-f(y)\|_1. \tag{16}$$

For a proof of (16) see [59, Fact 2.1]: this is a simple consequence of the cut cone
representation, i.e., Corollary 3.2, since the identity (3) shows that it suffices to
prove (16) when $f(x) = \mathbf{1}_S(x)$ for some $S\subseteq V$, in which case the desired inequality
follows immediately from the definition of the edge expansion $\alpha(V,E')$.

Since $L_2$ is isometric to a subset of $L_1$ (see, e.g., [71]), it follows from (16) and
the fact that $\alpha(V,E') \gtrsim 1$ that

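$$\varepsilon \stackrel{(15)}{\gtrsim} \frac{1}{|E'|}\sum_{xy\in E'} \sqrt{d(x,y)} = \frac{1}{|E'|}\sum_{xy\in E'} \|v_x-v_y\|_2 \gtrsim \frac{1}{|V|^2}\sum_{x,y\in V} \|v_x-v_y\|_2 = \frac{1}{|V|^2}\sum_{x,y\in V} \sqrt{d(x,y)}. \tag{17}$$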

Now comes the point where we use the assumption $\frac{1}{n^2}\sum_{x,y\in B(x_0,4)} d(x,y) \ge \frac14$.
Since for any $x,y\in B(x_0,4)$ we have $d(x,y) \le 8$, it follows that the number
of pairs $(x,y)\in B(x_0,4)\times B(x_0,4)$ with $d(x,y) \ge 1/8$ is at least $n^2/64$. Since
$|V| \ge (1-\varepsilon)|B(x_0,4)|$, the number of such pairs which are also in $V\times V$ is at
least $\frac{n^2}{64} - 2\varepsilon n^2 \gtrsim n^2$, provided $\varepsilon$ is small enough. Thus $\frac{1}{|V|^2}\sum_{x,y\in V}\sqrt{d(x,y)} \gtrsim 1$,
and (17) becomes a contradiction for small enough $\varepsilon$. □

Remark 4.8. The above proof of Theorem 4.7 used very little of the fact that
$d$ is a metric of negative type. In fact, all that was required was that $d$ admits a
quasisymmetric embedding into $L_2$; see [59].

It remains to say a few words about the proof of Theorem 4.4. Unfortunately,
the present proof of this theorem is long and involved, and it relies on a variety of
results from metric embedding theory. It would be of interest to obtain a simpler
proof. Lemma 4.5 implies that Theorem 4.4 is a consequence of the following
embedding result:

Theorem 4.9 ([5]). Every n-point metric space of negative type embeds into Hilbert
space with distortion $\lesssim (\log n)^{\frac12+o(1)}$.

Theorem 4.9 improves over the previously known [23] bound of $\lesssim (\log n)^{3/4}$ on
the Euclidean distortion of n-point metric spaces of negative type. As we shall
explain below, Theorem 4.9 is tight up to the $o(1)$ term.

The proof of Theorem 4.9 uses the following notion from [5]:



Definition 4.10 (Random zero-sets [5]). Fix $\Delta, \zeta > 0$, and $p\in(0,1)$. A metric
space $(\mathcal{M},d)$ is said to admit a random zero set at scale $\Delta$, which is $\zeta$-spreading
with probability $p$, if there is a probability distribution $\mu$ over subsets $Z\subseteq\mathcal{M}$ such
that $\mu\left(\{Z :\ y\in Z\ \wedge\ d(x,Z) \ge \Delta/\zeta\}\right) \ge p$ for every $x,y\in\mathcal{M}$ with $d(x,y) \ge \Delta$.
We denote by $\zeta(\mathcal{M};p)$ the least $\zeta > 0$ such that for every $\Delta > 0$, $\mathcal{M}$ admits a
random zero set at scale $\Delta$ which is $\zeta$-spreading with probability $p$.

The connection to metrics of negative type is due to the following theorem,
which can be viewed as the main structural consequence of [6]. Its proof uses [6]
in conjunction with two additional ingredients: an analysis of the algorithm of [6]
due to [50], and a clever iterative application of the algorithm of [5], due to [53],
while carefully reweighting points at each step.

Theorem 4.11 (Random zero sets for negative type metrics). There exists a
universal constant $p > 0$ such that any n-point metric space $(\mathcal{M},d)$ of negative
type satisfies $\zeta(\mathcal{M};p) \lesssim \sqrt{\log n}$.

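Random zero sets are related to embeddings as follows. Fix $\Delta > 0$. Let $(\mathcal{M},d)$
be a finite metric space, and fix $S\subseteq\mathcal{M}$. By the definition of $\zeta(S;p)$, there exists a
distribution $\mu$ over subsets $Z\subseteq S$ such that for every $x,y\in S$ with $d(x,y) \ge \Delta$ we
have $\mu\left(\{Z\subseteq S :\ y\in Z\ \wedge\ d(x,Z) \ge \Delta/\zeta(S;p)\}\right) \ge p$. Define $\varphi_{S,\Delta}:\mathcal{M}\to L_2(\mu)$
by $\varphi_{S,\Delta}(x) = d(x,Z)$. Then $\varphi_{S,\Delta}$ is 1-Lipschitz, and for every $x,y\in S$ with
$d(x,y) \ge \Delta$,

$$\|\varphi_{S,\Delta}(x)-\varphi_{S,\Delta}(y)\|_{L_2(\mu)} = \left(\int_{2^S} \big[d(x,Z)-d(y,Z)\big]^2\,d\mu(Z)\right)^{1/2} \ge \frac{\Delta\sqrt{p}}{\zeta(S;p)}. \tag{18}$$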
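
As a sanity check of this construction, here is a minimal Python sketch (an illustration, not the method of [5]). It assumes that the distribution $\mu$ is given explicitly as a finite list of nonempty subsets with probabilities; the function name `zero_set_embedding`, the six-point cycle and the choice of $\mu$ (a uniformly random singleton) are all assumptions made for the example. The map sends $x$ to the vector $(\sqrt{\mu(Z)}\,d(x,Z))_Z$, whose Euclidean norm is the $L_2(\mu)$ norm of $Z\mapsto d(x,Z)$.

```python
import math

def zero_set_embedding(d, zero_sets, probs):
    """Map each point x to (sqrt(mu(Z)) * d(x, Z))_Z, a point of L_2(mu).
    Every Z in zero_sets is assumed to be nonempty."""
    n = len(d)
    return [
        [math.sqrt(p) * min(d[x][z] for z in Z) for Z, p in zip(zero_sets, probs)]
        for x in range(n)
    ]

def l2_dist(u, v):
    """Euclidean distance between two coordinate vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Toy example: 6 points on a cycle with the shortest-path metric.
n = 6
d = [[min(abs(x - y), n - abs(x - y)) for y in range(n)] for x in range(n)]
# Hypothetical distribution: Z = {z} for a uniformly random point z.
zero_sets = [[z] for z in range(n)]
probs = [1.0 / n] * n
phi = zero_set_embedding(d, zero_sets, probs)
```

For this particular $\mu$ the event $\{y\in Z\}$ has probability $1/n$ and forces $d(x,Z) = d(x,y)$, so the analogue of (18) reads $\|\varphi(x)-\varphi(y)\| \ge d(x,y)/\sqrt{n}$; it is far from the $\sqrt{\log n}$ guarantee of Theorem 4.11 and only illustrates the mechanism.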

The remaining task is to "glue" the mappings $\{\varphi_{S,\Delta} :\ \Delta > 0,\ S\subseteq\mathcal{M}\}$
to form an embedding of $\mathcal{M}$ into Hilbert space with the distortion claimed in
Theorem 4.9. A key ingredient of the proof of Theorem 4.9 is the embedding
method called "Measured Descent", which was developed in [48]. The results of [48]
were stated as embedding theorems rather than a gluing procedure; the realization
that a part of the arguments of [48] can be formulated explicitly as a general "gluing
lemma" is due to [50]. In [5] it was necessary to enhance the Measured Descent
technique in order to prove the following key theorem, which together with (18)
and Theorem 4.11 implies Theorem 4.9. See also [4] for a different enhancement
of Measured Descent, which also implies Theorem 4.9. The proof of Theorem 4.12
is quite intricate; we refer to [5] for the details.

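Theorem 4.12. Let $(\mathcal{M},d)$ be an n-point metric space. Suppose that there is
$\varepsilon\in[1/2,1]$ such that for every $\Delta > 0$, and every subset $S\subseteq\mathcal{M}$, there exists a
1-Lipschitz map $\varphi_{S,\Delta}:\mathcal{M}\to L_2$ with $\|\varphi_{S,\Delta}(x)-\varphi_{S,\Delta}(y)\|_2 \ge \Delta/(\log|S|)^{\varepsilon}$ whenever
$x,y\in S$ and $d(x,y) \ge \Delta$. Then $c_2(\mathcal{M}) \lesssim (\log n)^{\varepsilon}\log\log n$.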

The following corollary is an obvious consequence of Theorem 4.9 due to the
fact that $L_1$ is a metric space of negative type.

Corollary 4.13. Every $X\subseteq L_1$ embeds into $L_2$ with distortion $\lesssim (\log|X|)^{\frac12+o(1)}$.

We stated Corollary 4.13 since it is of special importance: in 1969, Enflo [34]
proved that the Hamming cube, i.e., $\{0,1\}^k$ equipped with the metric induced
from $\ell_1^k$, has Euclidean distortion $\sqrt{k}$. Corollary 4.13 says that up to lower order
factors, the Hamming cube is among the most non-Euclidean subsets of $L_1$. There
are very few known results of this type, i.e., (almost) sharp evaluations of the
largest Euclidean distortion of an n-point subset of a natural metric space. A
notable such result is Matoušek's theorem [57] that any n-point subset of the
infinite binary tree has Euclidean distortion $\lesssim \sqrt{\log\log n}$, and consequently, due
to [50], the same holds true for n-point subsets of, say, the hyperbolic plane. This
is tight due to Bourgain's matching lower bound [16] for the Euclidean distortion
of finite depth complete binary trees.
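
The upper bound in Enflo's theorem is easy to see numerically: the identity map from the Hamming cube into $\ell_2^k$ already has distortion exactly $\sqrt{k}$. The following minimal Python sketch (the function name `identity_distortion` is an assumption for illustration) computes this by brute force for small $k$. It says nothing about the harder half of the theorem, namely that no other embedding does better.

```python
from itertools import product
import math

def identity_distortion(k):
    """Distortion of the identity map from ({0,1}^k, Hamming metric) into l_2^k:
    (largest ratio euclid/hamming) divided by (smallest ratio euclid/hamming)."""
    cube = list(product((0, 1), repeat=k))
    ratios = []
    for x in cube:
        for y in cube:
            if x != y:
                hamming = sum(a != b for a, b in zip(x, y))
                euclid = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
                ratios.append(euclid / hamming)
    return max(ratios) / min(ratios)
```

Since the Euclidean distance between two cube points at Hamming distance $h$ is $\sqrt{h}$, the ratio is $1/\sqrt{h}$, which ranges between $1$ and $1/\sqrt{k}$.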

4.3.2. The Goemans-Linial conjecture. Theorem 4.4 is the best known
approximation algorithm for the Sparsest Cut Problem (and Theorem 4.3 is the best
known algorithm in the case of uniform demands). But a comparison of Lemma 4.5
and Theorem 4.9 reveals a possible avenue for further improvement: Theorem 4.9
produces an embedding of negative type metrics into $L_2$ (for which the bound
of Theorem 4.9 is sharp up to lower order factors), while for Lemma 4.5 all we
need is an embedding into the larger space $L_1$. It was conjectured by Goemans
and Linial (see [37, 55, 54] and [58, pg. 379-380]) that any finite metric space
of negative type embeds into $L_1$ with distortion $\lesssim 1$. If true, this would yield,
via the Goemans-Linial semidefinite relaxation, a constant factor approximation
algorithm for Sparsest Cut.

As we shall see below, it turns out that the Goemans-Linial conjecture is false,
and in fact there exist [30] arbitrarily large n-point metric spaces $\mathcal{M}_n$ of negative
type for which $c_1(\mathcal{M}_n) \gtrsim (\log n)^c$, where $c$ is a universal constant. Due
to the duality argument in Lemma 4.5 this means that the algorithm of
Section 4.3 is doomed to make an error of at least $(\log n)^c$, i.e., there exist capacity
and demand functions $C_n, D_n : \{1,\ldots,n\}\times\{1,\ldots,n\}\to[0,\infty)$ for which we
have $M^{**}(C_n,D_n) \lesssim \Phi^*(C_n,D_n)/(\log n)^c$. Such a statement is referred to in the
literature as the fact that the integrality gap of the Goemans-Linial semidefinite
relaxation of Sparsest Cut is at least $(\log n)^c$.

4.3.3. Unique Games hardness and the Khot-Vishnoi integrality gap. 

Khot's Unique Games Conjecture [44] is that for every $\varepsilon > 0$ there exists a prime
$p = p(\varepsilon)$ such that there is no polynomial time algorithm that, given $n\in\mathbb{N}$ and
a system of $m$ linear equations in $n$ variables of the form $x_i - x_j \equiv c_{ij} \bmod p$ for
some $c_{ij}\in\mathbb{N}$, determines whether there exists an assignment of an integer value
to each variable $x_i$ such that at least $(1-\varepsilon)m$ of the equations are satisfied, or
whether no assignment of such values can satisfy more than $\varepsilon m$ of the equations
(if neither of these possibilities occurs, then an arbitrary output is allowed). This
formulation of the conjecture is due to [46], where it is shown that it is equivalent
to the original formulation in [44]. The Unique Games Conjecture is by now a
common assumption that has numerous applications in computational complexity;
see the survey [45] (in this collection) for more information.

In [47, 21] it was shown that the existence of a polynomial time constant factor
approximation algorithm for Sparsest Cut would refute the Unique Games
Conjecture, i.e., one can use a polynomial time constant factor approximation algorithm
for Sparsest Cut to solve in polynomial time the above algorithmic task for linear
equations.
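
To pin down the algorithmic task in the conjecture, the following minimal Python sketch computes, by exhaustive search, the largest fraction of equations $x_i - x_j \equiv c_{ij} \bmod p$ that a single assignment can satisfy. The function name `max_satisfiable_fraction`, the encoding of an equation as a triple `(i, j, c)` and the two small instances are assumptions made for illustration. The search takes time $p^n$, so this only makes the definition concrete; the conjecture concerns polynomial time algorithms.

```python
from itertools import product

def max_satisfiable_fraction(n, p, equations):
    """Largest fraction of equations x_i - x_j = c (mod p), given as triples
    (i, j, c), satisfied by one assignment x in {0, ..., p-1}^n (brute force)."""
    best = 0
    for x in product(range(p), repeat=n):
        satisfied = sum((x[i] - x[j] - c) % p == 0 for i, j, c in equations)
        best = max(best, satisfied)
    return best / len(equations)

# Hypothetical instances with n = 3 variables and p = 3.
consistent = [(0, 1, 1), (1, 2, 1), (0, 2, 2)]    # x0-x1=1, x1-x2=1, x0-x2=2
inconsistent = [(0, 1, 1), (1, 2, 1), (0, 2, 0)]  # third equation contradicts the first two
```

In the first instance all equations can be satisfied simultaneously (for example by $x = (2,1,0)$), while in the second at most two of the three can.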

For a period of time in 2004, this computational hardness result led to a strange
situation: the complexity theoretic Unique Games Conjecture and the purely
geometric Goemans-Linial conjecture could not both be true. In a remarkable
tour de force, Khot and Vishnoi [47] delved into the proof of their hardness result
and managed to construct from it a concrete family of arbitrarily large n-point
metric spaces $\mathcal{M}_n$ of negative type for which $c_1(\mathcal{M}_n) \gtrsim (\log\log n)^c$, where $c$ is
a universal constant, thus refuting the Goemans-Linial conjecture. Subsequently,
these Khot-Vishnoi metric spaces $\mathcal{M}_n$ were analyzed in [49], resulting in the lower
bound $c_1(\mathcal{M}_n) \gtrsim \log\log n$. Further work in [32] yielded a $\gtrsim \log\log n$ integrality gap
for Sparsest Cut with uniform demands, i.e., "average distortion" $L_1$ embeddings
(in the sense of Lemma 4.6) of negative type metrics were ruled out as well.

4.3.4. The Bretagnolle, Dacunha-Castelle, Krivine theorem and invariant
metrics on Abelian groups. A combination of Schoenberg's classical
characterization [65] of metric spaces that are isometric to subsets of Hilbert space, and
a theorem of Bretagnolle, Dacunha-Castelle and Krivine [18] (see also [70]), implies
that if $p\in[1,2]$ and $(X,\|\cdot\|_X)$ is a separable Banach space such that the metric
space $(X,\|x-y\|_X^{p/2})$ is isometric to a subset of Hilbert space, then $X$ is (linearly)
isometric to a subspace of $L_p$. Specializing to $p = 1$ we see that the Goemans-Linial
conjecture is true for Banach spaces. With this motivation for the Goemans-Linial
conjecture in mind, one notices that the Goemans-Linial conjecture is part of a
natural one parameter family of conjectures which attempt to extend the theorem
of Bretagnolle, Dacunha-Castelle and Krivine to general metric spaces rather than
Banach spaces: is it true that for $p\in[1,2)$ any metric space $(\mathcal{M},d)$ for which
$(\mathcal{M},d^{p/2})$ is isometric to a subset of $L_2$ admits a bi-Lipschitz embedding into $L_p$?
This generalized Goemans-Linial conjecture turns out to be false for all $p\in[1,2)$;
our example based on the Heisenberg group furnishes counter-examples for all $p$.

It is also known that certain invariant metrics on Abelian groups satisfy the 
Goemans-Linial conjecture: 

Theorem 4.14 ([10]). Let $G$ be a finite Abelian group, equipped with an invariant
metric $\rho$. Suppose that $2 \le m\in\mathbb{N}$ satisfies $mx = 0$ for all $x\in G$. Denote
$D = c_2(G,\sqrt{\rho})$. Then $c_1(G,\rho) \lesssim D^4\log m$.

It is an interesting open question whether the dependence on the exponent $m$
of the group $G$ in Theorem 4.14 is necessary. Can one construct a counter-example
to the Goemans-Linial conjecture which is an invariant metric on the cyclic group
$C_n$ of order $n$? Or, is there for every $D \ge 1$ a constant $K(D)$ such that for every
invariant metric $\rho$ on $C_n$ for which $c_2(C_n,\sqrt{\rho}) \le D$ we have $c_1(C_n,\rho) \le K(D)$?

One can view the above discussion as motivation for why one might consider
the Heisenberg group as a potential counter-example to the Goemans-Linial
conjecture. Assuming that we are interested in invariant metrics on groups, we wish
to depart from the setting of Abelian groups or Banach spaces, and if at the same
time we would like our example to have some useful analytic properties (such as
invariance under rescaling and the availability of a group norm), the Heisenberg
group suggests itself as a natural candidate. This plan is carried out in Section 5.



5. Embeddings of the Heisenberg group 

The purpose of this section is to discuss Theorem 1.1 and Theorem 1.2 from the
introduction. Before doing so, we have an important item of unfinished business:
relating the Heisenberg group to the Sparsest Cut Problem. We will do this in
Section 5.1, following [52].

In preparation, we need to recall the Carnot-Carathéodory geometry of the
continuous Heisenberg group $\mathbb{H}$, i.e., $\mathbb{R}^3$ equipped with the non-commutative product
$(a,b,c)\cdot(a',b',c') = (a+a',\ b+b',\ c+c'+ab'-ba')$. Due to lack of space, this
will have to be a crash course, and we refer to the relevant introductory sections
of [29] for a more thorough discussion.

The identity element of $\mathbb{H}$ is $e = (0,0,0)$, and the inverse element of $(a,b,c)\in\mathbb{H}$
is $(-a,-b,-c)$. The center of $\mathbb{H}$ is the z-axis $\{0\}\times\{0\}\times\mathbb{R}$. For $g\in\mathbb{H}$ the
horizontal plane at $g$ is defined as $H_g = g(\mathbb{R}\times\mathbb{R}\times\{0\})$. An affine line $L\subseteq\mathbb{H}$
is called a horizontal line if for some $g\in\mathbb{H}$ it passes through $g$ and is contained
in the affine plane $H_g$. The standard scalar product $\langle\cdot,\cdot\rangle$ on $H_e$ naturally induces
a scalar product $\langle\cdot,\cdot\rangle_g$ on $H_g$ by $\langle gx,gy\rangle_g = \langle x,y\rangle$. Consequently, we can define
the Carnot-Carathéodory metric $d^{\mathbb{H}}$ on $\mathbb{H}$ by letting $d^{\mathbb{H}}(g,h)$ be the infimum of
lengths of smooth curves $\gamma:[0,1]\to\mathbb{H}$ such that $\gamma(0) = g$, $\gamma(1) = h$ and for
all $t\in[0,1]$ we have $\gamma'(t)\in H_{\gamma(t)}$ (and the length of $\gamma'(t)$ is computed with
respect to the scalar product $\langle\cdot,\cdot\rangle_{\gamma(t)}$). The ball-box principle (see [39]) implies
that $d^{\mathbb{H}}((a,b,c),(a',b',c'))$ is bounded above and below by a constant multiple of
$|a-a'| + |b-b'| + \sqrt{|c-c'+ab'-ba'|}$. Moreover, since the integer grid $\mathbb{H}(\mathbb{Z})$ is
a discrete cocompact subgroup of $\mathbb{H}$, the word metric $d_W$ on $\mathbb{H}(\mathbb{Z})$ is bi-Lipschitz
equivalent to the restriction of $d^{\mathbb{H}}$ to $\mathbb{H}(\mathbb{Z})$ (see, e.g., [19]). For $\theta > 0$ define the
dilation operator $\delta_\theta:\mathbb{H}\to\mathbb{H}$ by $\delta_\theta(a,b,c) = (\theta a,\theta b,\theta^2 c)$. Then for all $g,h\in\mathbb{H}$
we have $d^{\mathbb{H}}(\delta_\theta(g),\delta_\theta(h)) = \theta\, d^{\mathbb{H}}(g,h)$. The Lebesgue measure $\mathcal{L}^3$ on $\mathbb{R}^3$ is a Haar
measure of $\mathbb{H}$, and the volume of a $d^{\mathbb{H}}$-ball of radius $r$ is proportional to $r^4$.
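
The algebraic facts recalled in this crash course are easy to experiment with. The following minimal Python sketch implements the group product, the inverse, the dilations and the "box" quantity $|a|+|b|+\sqrt{|c|}$ from the ball-box principle; the function names (`mul`, `inv`, `dilate`, `box_gauge`, `box_distance`) are assumptions chosen for illustration. Note that `box_distance` is only comparable to $d^{\mathbb{H}}$ up to constant factors; it is not the Carnot-Carathéodory metric itself.

```python
import math

def mul(g, h):
    """Group product (a,b,c)(a',b',c') = (a+a', b+b', c+c'+ab'-ba')."""
    a, b, c = g
    a2, b2, c2 = h
    return (a + a2, b + b2, c + c2 + a * b2 - b * a2)

def inv(g):
    """Inverse element: (a,b,c)^{-1} = (-a,-b,-c)."""
    a, b, c = g
    return (-a, -b, -c)

def dilate(theta, g):
    """Dilation delta_theta(a,b,c) = (theta*a, theta*b, theta^2*c)."""
    a, b, c = g
    return (theta * a, theta * b, theta**2 * c)

def box_gauge(g):
    """|a| + |b| + sqrt(|c|), comparable to d^H(e, g) by the ball-box principle."""
    a, b, c = g
    return abs(a) + abs(b) + math.sqrt(abs(c))

def box_distance(g, h):
    """Gauge of g^{-1}h, i.e. |a-a'| + |b-b'| + sqrt(|c-c'+ab'-ba'|)."""
    return box_gauge(mul(inv(g), h))
```

For instance, `mul((1, 0, 0), (0, 1, 0))` and `mul((0, 1, 0), (1, 0, 0))` differ in the last coordinate, which exhibits the non-commutativity, and `box_gauge(dilate(theta, g))` equals `theta * box_gauge(g)`, mirroring the scaling identity for $d^{\mathbb{H}}$.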

5.1. Heisenberg metrics with isometric $L_p$ snowflakes. For
every $(a,b,c)\in\mathbb{H}$ and $p\in[1,2)$, define

$$M_p(a,b,c) = \sqrt[4]{(a^2+b^2)^2+4c^2}\cdot\left(\cos\left(\frac{p}{2}\arccos\left(\frac{a^2+b^2}{\sqrt{(a^2+b^2)^2+4c^2}}\right)\right)\right)^{1/p}.$$

It was shown in [52] that $M_p$ is a group norm on $\mathbb{H}$, i.e., for all $g,h\in\mathbb{H}$ and $\theta \ge 0$
we have $M_p(gh) \le M_p(g)+M_p(h)$, $M_p(g^{-1}) = M_p(g)$ and $M_p(\delta_\theta(g)) = \theta M_p(g)$.

Thus $d_p(g,h) = M_p(g^{-1}h)$ is a left-invariant metric on $\mathbb{H}$. The metric $d_p$ is
bi-Lipschitz equivalent to $d^{\mathbb{H}}$ with distortion of order $1/\sqrt{2-p}$ (see [52]). Moreover,
it was shown in [52] that $(\mathbb{H}, d_p^{p/2})$ admits an isometric embedding into $L_2$. Thus,
in particular, the metric space $(\mathbb{H}, d_1)$, which is bi-Lipschitz equivalent to $(\mathbb{H}, d^{\mathbb{H}})$, is
of negative type.
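
The group norm properties can be probed numerically. The sketch below is an illustration under a stated assumption: it takes $M_p$ to be $\sqrt[4]{(a^2+b^2)^2+4c^2}\cdot\big(\cos(\tfrac{p}{2}\arccos(\tfrac{a^2+b^2}{\sqrt{(a^2+b^2)^2+4c^2}}))\big)^{1/p}$, with the convention $M_p(e) = 0$. The names `M`, `mul` and `max_triangle_defect` are hypothetical. A random search of this kind can only fail to find a violation of the triangle inequality; the proof is in [52].

```python
import math
import random

def mul(g, h):
    """Heisenberg product (a,b,c)(a',b',c') = (a+a', b+b', c+c'+ab'-ba')."""
    a, b, c = g
    a2, b2, c2 = h
    return (a + a2, b + b2, c + c2 + a * b2 - b * a2)

def M(p, g):
    """Evaluate M_p(a,b,c) for p in [1,2), under the formula assumed above."""
    a, b, c = g
    s = a * a + b * b
    r2 = math.hypot(s, 2.0 * c)           # sqrt((a^2+b^2)^2 + 4c^2)
    if r2 == 0.0:
        return 0.0                         # convention at the identity element
    angle = math.acos(min(1.0, s / r2))   # lies in [0, pi/2]
    return math.sqrt(r2) * math.cos(0.5 * p * angle) ** (1.0 / p)

def max_triangle_defect(p, trials=2000, seed=0):
    """Largest observed M_p(gh) - M_p(g) - M_p(h) over random g, h."""
    rng = random.Random(seed)
    worst = -math.inf
    for _ in range(trials):
        g = tuple(rng.uniform(-3, 3) for _ in range(3))
        h = tuple(rng.uniform(-3, 3) for _ in range(3))
        worst = max(worst, M(p, mul(g, h)) - M(p, g) - M(p, h))
    return worst
```

On horizontal elements $(a,b,0)$ the function reduces to the Euclidean norm $\sqrt{a^2+b^2}$, and on the center $M_1(0,0,c) = \sqrt{|c|}$, in line with the square-root behaviour of $d^{\mathbb{H}}$ along the z-axis.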

The fact that $(\mathbb{H}, d^{\mathbb{H}})$ does not admit a bi-Lipschitz embedding into $L_p$ for
any $1 \le p < \infty$ will show that the generalized Goemans-Linial conjecture (see
Section 4.3.4) is false. In particular, $(\mathbb{H}, d_1)$, and hence by a standard rescaling
argument also $(\mathbb{H}(\mathbb{Z}), d_1)$, is a counter-example to the Goemans-Linial conjecture.
Note that it is crucial here that we are dealing with the function space $L_p$ rather
than the sequence space $\ell_p$, in order to use a compactness argument to deduce from
this statement that there exist arbitrarily large n-point metric spaces $(\mathcal{M}_n, d_n)$ such
that $(\mathcal{M}_n, d_n^{p/2})$ is isometric to a subset of $L_2$, yet $\lim_{n\to\infty} c_p(\mathcal{M}_n) = \infty$. The
fact that this statement follows from non-embeddability into $L_p$ is a consequence
of a well known ultrapower argument (see [42]), yet for $\ell_p$ this statement is false
(e.g., $\ell_2$ does not admit a bi-Lipschitz embedding into $\ell_p$, but all finite subsets
of $\ell_2$ embed isometrically into $\ell_p$). Unfortunately, this issue creates substantial
difficulties in the case of primary interest $p = 1$. In the reflexive range $p > 1$, or for
a separable dual space such as $\ell_1$ $(= c_0^*)$, the non-embeddability of $\mathbb{H}$ follows from
a natural extension of a classical result of Pansu [61], as we explain in Section 5.2.
This approach fails badly when it comes to embeddings into $L_1$: for this purpose
a novel method of Cheeger and Kleiner [25] is needed, as described in Section 5.3.

5.2. Pansu differentiability. Let $X$ be a Banach space and $f:\mathbb{H}\to X$.
Following [61], $f$ is said to have a Pansu derivative at $x\in\mathbb{H}$ if for every $y\in\mathbb{H}$ the
limit $D_f^x(y) = \lim_{\theta\to 0} (f(x\delta_\theta(y)) - f(x))/\theta$ exists, and $D_f^x:\mathbb{H}\to X$ is a group
homomorphism, i.e., for all $y_1,y_2\in\mathbb{H}$ we have $D_f^x(y_1y_2^{-1}) = D_f^x(y_1) - D_f^x(y_2)$.
Pansu proved [61] that every $f:\mathbb{H}\to\mathbb{R}^n$ which is Lipschitz in the metric $d^{\mathbb{H}}$
is Pansu differentiable almost everywhere. It was observed in [52, 27] that this
result holds true if the target space $\mathbb{R}^n$ is replaced by any Banach space with the
Radon-Nikodým property; in particular $X$ can be any reflexive Banach space such
as $L_p$ for $p\in(1,\infty)$, or a separable dual Banach space such as $\ell_1$. As noted by
Semmes [66], this implies that $\mathbb{H}$ does not admit a bi-Lipschitz embedding into
any Banach space $X$ with the Radon-Nikodým property: a bi-Lipschitz condition
for $f$ implies that at a point $x\in\mathbb{H}$ of Pansu differentiability, $D_f^x$ is also
bi-Lipschitz, and in particular a group isomorphism onto its image. But this is
impossible since $\mathbb{H}$ is non-commutative, unlike the additive group of $X$.

5.3. Cheeger-Kleiner differentiability. Differentiability theorems fail
badly when the target space is $L_1$, even for functions defined on $\mathbb{R}$; consider
Aronszajn's example [3] of the "moving indicator function" $t\mapsto\mathbf{1}_{[0,t]}\in L_1$. For
$L_1$-valued Lipschitz functions on $\mathbb{H}$, Cheeger and Kleiner [25, 28] developed an
alternative differentiation theory, which is sufficiently strong to show that $\mathbb{H}$ does
not admit a bi-Lipschitz embedding into $L_1$. Roughly speaking, a differentiation
theorem states that in the infinitesimal limit, a Lipschitz mapping converges to a
mapping that belongs to a certain "structured" subclass of mappings (e.g., linear
mappings or group homomorphisms). The Cheeger-Kleiner theory shows that, in
a sense that will be made precise below, $L_1$-valued Lipschitz functions on $\mathbb{H}$ are in
the infinitesimal limit similar to Aronszajn's moving indicator.
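
The failure of differentiability for the moving indicator can be seen in a small computation. The following Python sketch is an illustration only: it discretizes $L_1([0,1])$ into cells of equal width, and the function names `moving_indicator`, `l1_dist` and `difference_quotient` are assumptions. The map $t\mapsto\mathbf{1}_{[0,t]}$ is an isometry of $[0,1]$ into $L_1$, yet its difference quotients at scales $\theta$ and $\theta/2$ stay at $L_1$ distance $1$ from each other, so they do not converge as $\theta\to 0$.

```python
def moving_indicator(t, grid_size):
    """Cell averages of 1_[0,t] on [0,1], using grid_size cells of equal width.
    Entry i is the fraction of cell i that is covered by [0, t]."""
    h = 1.0 / grid_size
    return [min(max(t - i * h, 0.0), h) / h for i in range(grid_size)]

def l1_dist(u, v):
    """L_1([0,1]) distance between two cell-averaged functions."""
    h = 1.0 / len(u)
    return h * sum(abs(a - b) for a, b in zip(u, v))

def difference_quotient(t, theta, grid_size):
    """(f(t + theta) - f(t)) / theta for f(t) = 1_[0,t], cell-averaged."""
    f0 = moving_indicator(t, grid_size)
    f1 = moving_indicator(t + theta, grid_size)
    return [(b - a) / theta for a, b in zip(f0, f1)]
```

Indeed, the quotient at scale $\theta$ is $\theta^{-1}\mathbf{1}_{[t,t+\theta]}$, a unit vector of $L_1$ that concentrates on a shrinking interval rather than converging.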

For an open subset $U\subseteq\mathbb{H}$ let $\mathrm{Cut}(U)$ denote the space of (equivalence classes
up to measure zero of) measurable subsets of $U$. Let $f: U\to L_1$ be a Lipschitz
function. An infinitary variant of the cut-cone decomposition of Corollary 3.2
(see [25]) asserts that there exists a measure $\Sigma_f$ on $\mathrm{Cut}(U)$, such that for all
$x,y\in U$ we have $\|f(x)-f(y)\|_1 = \int_{\mathrm{Cut}(U)} |\mathbf{1}_E(x)-\mathbf{1}_E(y)|\,d\Sigma_f(E)$. The measure
$\Sigma_f$ is called the cut measure of $f$. The idea of Cheeger and Kleiner is to detect the
"infinitesimal regularity" of $f$ in terms of the infinitesimal behavior of the measure
$\Sigma_f$; more precisely, in terms of the shape of the sets $E$ in the support of $\Sigma_f$, after
passing to an infinitesimal limit.

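Theorem 5.1 (Cheeger-Kleiner differentiability theorem [25, 28]). For almost
every $x\in U$ there exists a measure $\Sigma_f^x$ on $\mathrm{Cut}(\mathbb{H})$ such that for all $y,z\in\mathbb{H}$
we have

$$\lim_{\theta\to 0}\frac{\|f(x\delta_\theta(y)) - f(x\delta_\theta(z))\|_1}{\theta} = \int_{\mathrm{Cut}(\mathbb{H})} |\mathbf{1}_E(y)-\mathbf{1}_E(z)|\,d\Sigma_f^x(E). \tag{19}$$

Moreover, the measure $\Sigma_f^x$ is supported on affine half-spaces whose boundary is a
vertical plane, i.e., a plane which isn't of the form $H_g$ for some $g\in\mathbb{H}$ (equivalently,
an inverse image, with respect to the orthogonal projection from $\mathbb{R}^3$ onto $\mathbb{R}\times\mathbb{R}\times\{0\}$,
of a line in $\mathbb{R}\times\mathbb{R}\times\{0\}$).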

Theorem 5.1 is incompatible with $f$ being bi-Lipschitz, since the right hand
side of (19) vanishes when $y,z$ lie on the same coset of the center of $\mathbb{H}$, while if $f$
is bi-Lipschitz the left hand side of (19) is at least a constant multiple of $d^{\mathbb{H}}(y,z)$.

5.4. Compression bounds for embeddings of the Heisenberg
group. Theorem 1.1 and Theorem 1.2 are both a consequence of the
following result from [29]:

Theorem 5.2 (Quantitative central collapse [29]). There exists a universal
constant $c\in(0,1)$ such that for every $p\in\mathbb{H}$, every 1-Lipschitz $f: B(p,1)\to L_1$,
and every $\varepsilon\in(0,\frac14)$, there exists $r \ge \varepsilon$ such that with respect to Haar
measure, for at least half of the points $x\in B(p,1/2)$, at least half of the points
$(x_1,x_2)\in B(x,r)\times B(x,r)$ which lie on the same coset of the center satisfy:

$$\frac{\|f(x_1)-f(x_2)\|_1}{d^{\mathbb{H}}(x_1,x_2)} \le \frac{1}{(\log(1/\varepsilon))^c}.$$

It isn't difficult to see that Theorem 5.2 implies Theorem 1.1 and Theorem 1.2.
For example, in the setting of Theorem 1.1 we are given a bi-Lipschitz embedding
$f:\{1,\ldots,n\}^3\to L_1$, and using either the general extension theorem of [51] or a
partition of unity argument, we can extend $f$ to a Lipschitz (with respect to $d^{\mathbb{H}}$)
mapping $\tilde f:[1,n]^3\to L_1$, whose Lipschitz constant is at most a constant multiple
of the Lipschitz constant of $f$. Theorem 5.2 (after rescaling by $n$) produces a pair
of points $y,z\in[1,n]^3$ of distance $\gtrsim\sqrt{n}$, whose distance is contracted under $\tilde f$
by $\gtrsim (\log n)^c$. By rounding $y,z$ to their nearest integer points in $\{1,\ldots,n\}^3$, we
conclude that $f$ itself must have bi-Lipschitz distortion $\gtrsim (\log n)^c$. The deduction
of Theorem 1.2 from Theorem 5.2 is just as simple; see [29].

Theorem 5.2 is a quantitative version of Theorem 5.1 in the sense that it gives a
definite lower bound on the macroscopic scale at which a given amount of collapse
of cosets of the center, as exhibited by the differentiation result (19), occurs. As
explained in [29, Rem. 2.1], one cannot hope in general to obtain rate bounds in
differentiation results such as (19). Nevertheless, there are situations where
"quantitative differentiation results" have been successfully proved; important precursors
of Theorem 5.2 include the work of Bourgain [17], Jones [43], Matoušek [57], and
Bates, Johnson, Lindenstrauss, Preiss, Schechtman [13]. Specifically, we should
mention that Bourgain [17] obtained a lower bound on $\varepsilon > 0$ such that any
embedding of an $\varepsilon$-net in a unit ball of an n-dimensional normed space $X$ into a normed
space $Y$ has roughly the same distortion as the distortion required to embed all of
$X$ into $Y$, and Matoušek [57], in his study of embeddings of trees into uniformly
convex spaces, obtained quantitative bounds on the scale at which "metric
differentiation" is almost achieved, i.e., a scale at which discrete geodesics are mapped
by a Lipschitz function to "almost geodesics". These earlier results are in the
spirit of Theorem 5.2, though the proof of Theorem 5.2 in [29] is substantially
more involved.

We shall now say a few words on the proof of Theorem 5.2; for lack of space this
will have to be a rough sketch, so we refer to [29] for more details, as well as to the
somewhat different presentation in [30]. Cheeger and Kleiner obtained two different
proofs of Theorem 5.1. The first proof [25] started with the important observation
that the fact that $f$ is Lipschitz forces the cut measure $\Sigma_f$ to be supported on
sets with additional regularity, namely sets of finite perimeter. Moreover, there
is a definite bound on the total perimeter: $\int_{\mathrm{Cut}(B(p,1))} \mathrm{PER}(E,B(p,1))\,d\Sigma_f(E) \lesssim 1$,
where $\mathrm{PER}(E,B(p,1))$ denotes the perimeter of $E$ in the ball $B(p,1)$ (we refer
to the book [2], and the detailed explanation in [25, 29] for more information on
these notions). Theorem 5.1 is then proved in [25] via an appeal to results [35, 36]
on the infinitesimal structure of sets of finite perimeter in $\mathbb{H}$. A different proof of
Theorem 5.1 was found in [28]. It is based on the notion of metric differentiation,
which is used in [28] to reduce the problem to mappings $f:\mathbb{H}\to L_1$ for which the
cut measure is supported on monotone sets, i.e., sets $E\subseteq\mathbb{H}$ such that for every
horizontal line $L$, up to a set of measure zero, both $L\cap E$ and $L\cap(\mathbb{H}\setminus E)$ are
either empty or subrays of $L$. A non-trivial classification of monotone sets is then
proved in [28]: such sets are up to measure zero half-spaces.

This second proof of Theorem 5.1 avoids completely the use of perimeter
bounds. Nevertheless, the starting point of the proof of Theorem 5.2 can be viewed
as a hybrid argument, which incorporates both perimeter bounds and a new
classification of almost monotone sets. The quantitative setting of Theorem 5.2 leads to
issues that do not have analogues in the non-quantitative proofs (e.g., the
approximate classification results of "almost" monotone sets in balls cannot be simply
that such sets are close to half-spaces in the entire ball; see [29, Example 9.1]).

In order to proceed we need to quantify the extent to which a set $E\subseteq B(x,r)$
is monotone. For a horizontal line $L\subseteq\mathbb{H}$ define the non-convexity $\mathrm{NC}_{B(x,r)}(E,L)$
of $(E,L)$ on $B(x,r)$ as the infimum of $\int_{L\cap B(x,r)} |\mathbf{1}_I - \mathbf{1}_{E\cap L\cap B(x,r)}|\,d\mathcal{H}^1_L$ over all
sub-intervals $I\subseteq L\cap B(x,r)$. Here $\mathcal{H}^1_L$ is the 1-dimensional Hausdorff measure on
$L$ (induced from the metric $d^{\mathbb{H}}$). The non-monotonicity of $(E,L)$ on $B(x,r)$ is
defined to be $\mathrm{NM}_{B(x,r)}(E,L) \stackrel{\mathrm{def}}{=} \mathrm{NC}_{B(x,r)}(E,L) + \mathrm{NC}_{B(x,r)}(\mathbb{H}\setminus E,L)$. The total
non-monotonicity of $E$ on $B(x,r)$ is defined as:

$$\mathrm{NM}_{B(x,r)}(E) = \frac{1}{r^4}\int_{\mathrm{lines}(B(x,r))} \mathrm{NM}_{B(x,r)}(E,L)\,d\mathcal{N}(L),$$

where $\mathrm{lines}(U)$ denotes the set of horizontal lines in $\mathbb{H}$ which intersect $U$, and
$\mathcal{N}$ is the left invariant measure on $\mathrm{lines}(\mathbb{H})$, normalized so that the measure of
$\mathrm{lines}(B(e,1))$ is 1.
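
A one-dimensional discrete analogue may help in reading these definitions. In the following minimal Python sketch (an illustration, not the construction of [29]), the segment $L\cap B(x,r)$ is replaced by a row of cells of width `h`, a set $E$ by a list of booleans, and sub-intervals by contiguous index ranges; the names `non_convexity` and `non_monotonicity` are assumptions. The quantity `non_monotonicity` vanishes exactly when both the set and its complement are intervals of the row, which is the discrete counterpart of $L\cap E$ and $L\cap(\mathbb{H}\setminus E)$ being empty or subrays.

```python
def non_convexity(cells, h=1.0):
    """Minimum over contiguous index ranges I (possibly empty) of
    h * |I symmetric-difference E|, where E = {i : cells[i] is true}."""
    n = len(cells)
    prefix = [0]
    for c in cells:
        prefix.append(prefix[-1] + (1 if c else 0))
    total = prefix[-1]
    best = total                                  # I empty: all of E is an error
    for start in range(n):
        for end in range(start + 1, n + 1):       # I = cells[start:end]
            ones_in = prefix[end] - prefix[start]
            zeros_in = (end - start) - ones_in    # cells of I outside E
            ones_out = total - ones_in            # cells of E outside I
            best = min(best, zeros_in + ones_out)
    return h * best

def non_monotonicity(cells, h=1.0):
    """Non-convexity of E plus non-convexity of its complement."""
    complement = [not c for c in cells]
    return non_convexity(cells, h) + non_convexity(complement, h)
```

For example, an initial segment such as `[1, 1, 1, 0, 0]` has non-monotonicity zero, while an interior interval such as `[0, 1, 1, 0, 0]` is convex but its complement is not, so its non-monotonicity is positive.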

The following stability result for monotone sets constitutes the bulk of [29]:

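Theorem 5.3. There exists a universal constant $a > 0$ such that if a measurable
set $E\subseteq B(x,r)$ satisfies $\mathrm{NM}_{B(x,r)}(E) \le \varepsilon^a$ then there exists a half-space $\mathcal{P}$ such
that

$$\frac{\mathcal{L}^3\big((E\cap B_{\varepsilon r}(x))\,\triangle\,\mathcal{P}\big)}{\mathcal{L}^3(B_{\varepsilon r}(x))} \le \varepsilon^{1/3}.$$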

Perimeter bounds are used in [29, 30] for two purposes. The first is finding
a controlled scale $r$ such that at most locations, apart from a certain collection
of cuts, the mass of $\Sigma_f$ is supported on subsets which satisfy the assumption of
Theorem 5.3 (see [30, Sec. 9]). But the excluded cuts may have infinite measure
with respect to $\Sigma_f$. Nonetheless, using perimeter bounds once more, together with
the isoperimetric inequality in $\mathbb{H}$ (see [60, 21]), it is shown that their contribution
to the metric is negligibly small (see [30, Sec. 8]).

By Theorem 5.3 it remains to deal with the situation where all the cuts in the
support of $\Sigma_f$ are close to half-spaces: note that we are not claiming in Theorem 5.3
that the half-space is vertical. Nevertheless, a simple geometric argument shows
that even in the case of cut measures that are supported on general (almost)
half-spaces, the mapping $f$ must significantly distort some distances. The key point
here is that if the cut measure is actually supported on half-spaces, then it follows
(after the fact) that for every affine line $L$, if $x_1,x_2,x_3\in L$ and $x_2$ lies between
$x_1$ and $x_3$ then $\|f(x_1)-f(x_3)\|_1 = \|f(x_1)-f(x_2)\|_1 + \|f(x_2)-f(x_3)\|_1$. But if
$L$ is vertical then $d^{\mathbb{H}}|_L$ is bi-Lipschitz to the square root of the difference of the
z-coordinates, and it is trivial to verify that this metric on $L$ is not bi-Lipschitz
equivalent to a metric on $L$ satisfying this additivity condition. For the details of
(a quantitative version of) this final step of the argument see [30, Sec. 10].